Abstract

Concerning bivariate least squares linear regression, the classical re- 
sults obtained for extreme structural models in earlier attempts (Isobe 
et al., 1990; Feigelson and Babu, 1992) are reviewed using a new for- 
malism in terms of deviation (matrix) traces which, for homoscedas- 
tic data, reduce to usual quantities leaving aside an unessential (but 
dimensional) multiplicative factor. Within the framework of classi- 
cal error models, the dependent variable relates to the independent 
variable according to the usual additive model. The classes of linear
models considered are regression lines in the limit of uncorrelated errors
in X and in Y. The following models are considered in detail:
(Y) errors in X negligible (ideally null) with respect to errors in Y;
(X) errors in Y negligible (ideally null) with respect to errors in X;
(C) oblique regression; (O) orthogonal regression; (R) reduced major-axis
regression; (B) bisector regression. For homoscedastic data, the
results are taken from earlier attempts and rewritten using a more 
compact notation. For heteroscedastic data, the results are inferred 
from a procedure related to functional models (York, 1966; Caimmi, 
2011). An example of astronomical application is considered, concerning
the [O/H]-[Fe/H] empirical relations deduced from five samples
related to different stars and/or different methods of oxygen abundance
determination. For low-dispersion samples and assigned methods,
different regression models yield results which are in agreement
within the errors ($\mp\sigma$) for both heteroscedastic and homoscedastic
data, while the contrary holds for large-dispersion samples. In any 
case, samples related to different methods produce discrepant results, 
due to the presence of (still undetected) systematic errors, which implies
that no definitive statement can be made at present. Asymptotic
expressions approximate regression line slope and intercept variance
estimators, for normal residuals, to a better extent than
earlier attempts. Related fractional discrepancies do not exceed
a few percent for low-dispersion data, and grow up to about 10%
for large-dispersion data. An extension of the formalism to generic
structural models is left to a forthcoming paper. 
keywords - galaxies: evolution - stars: formation; evolution - methods: 
data analysis - methods: statistical. 

pacs codes: 98.62.-g; 97.10.Cv; 02.50.-r

1 Introduction 

Linear regression is a fundamental and frequently used statistical tool in
almost all branches of science, including astronomy. The related problem
is twofold: regression line slope and intercept estimators are obtained
by minimizing or maximizing some function of the data; on the other
hand, regression line slope and intercept variance estimators
require knowledge of the error distributions of the data. The complexity
mainly arises from the occurrence of intrinsic dispersion in addition to
the dispersion related to the measurement processes (hereafter quoted as
instrumental dispersion), where the distribution corresponding to the former
can be different from the distribution corresponding to the latter, i.e. non
Gaussian (non normal).

In statistics, problems where the true points lie precisely on an expected 
line are called functional regression models, while problems where the true 
points are (intrinsically) scattered about an expected line are called struc- 
tural regression models. Accordingly, functional regression models may be 
conceived as structural regression models where the intrinsic dispersion is 
negligible (ideally null) with respect to the instrumental dispersion. Con- 
versely, structural regression models where the instrumental dispersion is 
negligible (ideally null) with respect to the intrinsic dispersion can be
defined as extreme structural models (Caimmi, 2011, hereafter quoted as C11).
A distinction between functional and structural modelling is currently
preferred, where the former can be affected by intrinsic scatter but with no or
only minimal assumptions on related distributions, while the latter implies 
(usually parametric) models are placed on the above mentioned distribu- 
tions. For further details refer to specific textbooks (e.g., Carroll et al., 2006, 
Chap. 2, §2.1). In addition, models where the instrumental dispersion is the 
same from point to point for each variable, are called homoscedastic models, 
while models where the instrumental dispersion is (in general) different from 
point to point, are called heteroscedastic models. Similarly, related data are 
denoted as homoscedastic and heteroscedastic, respectively. 

Bivariate least squares linear regression related to heteroscedastic
functional models with uncorrelated and correlated errors, following Gaussian
distributions, was analysed and formulated in two classical papers (York,
1966, 1969; hereafter quoted as Y66 and Y69, respectively), where regression
line slope and intercept variance estimators are determined using the
method of partial differentiation (Y69). In contrast, the method of
moments estimator is used to this aim in later attempts [e.g., Fuller, 1987
(hereafter quoted as F87), Chap. 1, §1.3.2, Eq. (1.3.7) therein; Feigelson and
Babu, 1992, erratum, 2011, hereafter quoted together as FB92 if not otherwise
specified].

Bivariate least squares linear regression related to extreme structural 
models, where the instrumental dispersion is negligible (ideally null) with 
respect to intrinsic dispersion, was exhaustively treated in two classical pa- 
pers (Isobe et al., 1990, hereafter quoted as Ial90; FB92) and extended to 
generic structural models in a later attempt (Akritas and Bershady, 1996, 
hereafter quoted as AB96). 

The above mentioned papers provide the simplest description of linear
regression. In reality, biases and additional effects must be taken into
consideration, which implies a much more complicated description and
formulation, as can be seen in specific monographs (e.g., F87; Carroll et al.,
2006; Buonaccorsi, 2010). Restricting to the astronomical literature, a
recent investigation (Kelly, 2007) is particularly relevant in that it is the first
example (in the field under discussion) where linear regression is considered
following the modern (since about half a century ago) approach based on
likelihoods rather than the old (up to about a century ago) least-squares
approach. More specifically, a hierarchical measurement error model is set up
therein, the complicated likelihood is written down, and a variety of
minimum least-squares and Bayesian solutions are shown, which can treat
functional, structural, multivariate, truncated and censored measurement error
regression problems.

Even in dealing with the simplest homoscedastic (or heteroscedastic)
functional and structural models, still no unified analytic formalism has been
developed (to the knowledge of the author) where (i) structural heteroscedastic
models with instrumental and intrinsic dispersion of comparable order in 
both variables, are considered; (ii) previous results are recovered in the limit 
of dominant instrumental dispersion; and (iii) previous results are recovered 
in the limit of dominant intrinsic dispersion. A related formulation may be 
useful also for computational methods, in the sense that both the general 
case and limiting situations can be described by a single numerical code. 

A first step towards a unified analytic formalism of bivariate least squares
linear regression involving functional models has been performed in an earlier
attempt (C11), where the least-squares approach developed in two classical
papers (Y66; Y69) has been reviewed and reformulated by definition and use
of deviation (matrix) traces. The current investigation aims at making a
second step along the same direction, in dealing with extreme structural models.
More specifically, the results found in two classical papers (Ial90; FB92) shall 
be reformulated in terms of deviation traces for homoscedastic models, and 
extended to the general case of heteroscedastic models by analogy with their 
counterparts related to functional models, within the framework of classi- 
cal error models where the dependent variable relates to the independent 
variable according to the usual additive model. 

Regression line slope and intercept estimators, and related variance
estimators, are expressed in terms of deviation traces for different
homoscedastic models (Ial90; FB92) in Section 2, where an extension to corresponding
heteroscedastic models is also performed and both normal and non normal
residuals are considered. An example of astronomical application is outlined
in Section 3. The discussion is presented in Section 4. Finally, the conclusion
is shown in Section 5. Some points are developed in more detail in the
Appendix. An extension of the formalism to generic structural models is left
to a forthcoming paper.

2 Least-squares fitting of a straight line 

2.1 General considerations 

Attention shall be restricted to the classical problem of least-squares fit- 
ting of a straight line, where both variables are measured with errors and the 
true points are shifted to the actual points outside the unknown regression 
line, due to the intrinsic scatter, in the limit where the instrumental scatter 
is negligible (ideally null) with respect to the intrinsic scatter i.e. extreme 
structural models (e.g., Ial90; FB92). 

In general, the dependent variable, y, relates to the independent variable,
x, according to the usual additive model (e.g., AB96; Carroll et al., 2006,
Chap. 1, §1.2, Chap. 3, §3.2.1; Kelly, 2007; Buonaccorsi, 2010, Chap. 4, §4.3):

$$ y_{Si} = a x_{Si} + b + \epsilon_i ; \qquad 1 \le i \le n ; \tag{1} $$

where $P_{Si} \equiv (x_{Si}, y_{Si})$ are the actual points whose coordinates are affected by
no instrumental error and $\epsilon_i$ is a random variable with null expectation value
representing the intrinsic scatter in $(x_{Si}, y_{Si})$ about the regression line.¹ The
terms "independent variable" and "dependent variable" are purely conventional
when the model is symmetric in x and in y provided $a \ne 0$.
The equation of the unknown regression line implies: 

$$ y_i^* = a x_i^* + b ; \qquad 1 \le i \le n ; \tag{2} $$

where $P_i^* \equiv (x_i^*, y_i^*)$ are the true points whose coordinates are affected by
neither instrumental nor intrinsic scatter.

Due to the occurrence of intrinsic scatter, the actual points must be 
considered in place of the true points. The coordinates of actual and true 
points are assumed to be related as: 

$$ z_{Si} = z_i^* + (\xi_{S_z})_i ; \qquad z = x, y ; \qquad 1 \le i \le n ; \tag{3} $$

where $(\xi_{S_x})_i$, $(\xi_{S_y})_i$, are the intrinsic (i.e. due to the intrinsic scatter) errors
on $x_i^*$, $y_i^*$, respectively, assumed to obey specified distributions with null
expectation values and known variances, $[(\sigma_{xx})_S]_i$, $[(\sigma_{yy})_S]_i$, and covariance,
$[(\sigma_{xy})_S]_i$.

The combination of Eqs. (1), (2), and (3) yields:

$$ \epsilon_i = (\xi_{S_y})_i - a (\xi_{S_x})_i ; \tag{4} $$

where the contribution of each variable to the intrinsic scatter is explicitly
expressed. For the true regression line i.e. fixed slope, a, a null expectation
value of $\epsilon_i$ necessarily implies null expectation values of $(\xi_{S_x})_i$ and $(\xi_{S_y})_i$,
$1 \le i \le n$, and vice versa.
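For completeness, the step leading from Eqs. (1)-(3) to Eq. (4) can be written out explicitly. The lines below are only a worked sketch; the symbols $(\xi_{S_x})_i$, $(\xi_{S_y})_i$ for the intrinsic errors follow the reading of the text adopted here.

```latex
\begin{align*}
y_{Si} &= y_i^* + (\xi_{S_y})_i = a x_i^* + b + (\xi_{S_y})_i , \\
y_{Si} &= a x_{Si} + b + \epsilon_i = a x_i^* + a (\xi_{S_x})_i + b + \epsilon_i , \\
\Rightarrow \quad \epsilon_i &= (\xi_{S_y})_i - a (\xi_{S_x})_i .
\end{align*}
```

The first line uses Eqs. (3) and (2), the second uses Eqs. (1) and (3), and equating the two right-hand sides gives Eq. (4).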

Due to the occurrence of instrumental scatter, the observed points, $P_i \equiv (X_i, Y_i)$,
are evaluated in place of the actual points. The coordinates of
observed and actual points are assumed to be related as:

$$ Z_i = z_{Si} + (\xi_{F_z})_i ; \qquad Z = X, Y ; \qquad z = x, y ; \qquad 1 \le i \le n ; \tag{5} $$

where $(\xi_{F_x})_i$, $(\xi_{F_y})_i$, are the instrumental (i.e. due to the instrumental scatter)
errors on $x_{Si}$, $y_{Si}$, respectively, assumed to obey Gaussian distributions
with null expectation values and known variances, $[(\sigma_{xx})_F]_i$, $[(\sigma_{yy})_F]_i$, and
covariance, $[(\sigma_{xy})_F]_i$.

¹ The Italian convention shall be adopted here, according to which the slope and the
intercept of a straight line on the Cartesian plane are denoted as a, b, respectively.

Due to the occurrence of both instrumental and intrinsic scatter, the 
observed points are evaluated in place of the actual points which, in turn, 
must be considered in place of the true points. The coordinates of observed 
and true points, via Eqs. ([T]), (J2D and (JSJ), are related as: 

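$$ Z_i = z_i^* + \xi_{z_i} ; \qquad Z = X, Y ; \qquad z = x, y ; \qquad 1 \le i \le n ; \tag{6} $$

where the (instrumental + intrinsic) errors, $\xi_{x_i}$, $\xi_{y_i}$, are defined as:

$$ \xi_{z_i} = (\xi_{F_z})_i + (\xi_{S_z})_i ; \qquad z = x, y ; \qquad 1 \le i \le n ; \tag{7} $$

which obey specified distributions with null expectation values and known
variances, $(\sigma_{xx})_i$, $(\sigma_{yy})_i$, and covariance, $(\sigma_{xy})_i$. The further restriction that
$(\xi_{F_z})_i$, $(\xi_{S_z})_i$, $z = x, y$, $1 \le i \le n$, are independent, implies
$(\sigma_{zz})_i = [(\sigma_{zz})_F]_i + [(\sigma_{zz})_S]_i$, $(\sigma_{xy})_i = [(\sigma_{xy})_F]_i + [(\sigma_{xy})_S]_i$.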
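The chain true point, actual point, observed point described by Eqs. (1)-(7) can be illustrated with a short simulation. This is only a sketch under stated assumptions: the function `simulate_points` and its arguments are hypothetical names introduced here, and Gaussian, mutually uncorrelated intrinsic errors are assumed for simplicity, while the text only requires specified distributions with null expectation values.

```python
import numpy as np

def simulate_points(a, b, x_true, sig_sx, sig_sy, sig_fx, sig_fy, seed=0):
    """Hypothetical helper: true, actual and observed points for y = a x + b."""
    rng = np.random.default_rng(seed)
    x_true = np.asarray(x_true, dtype=float)
    n = x_true.size
    y_true = a * x_true + b                      # true points, Eq. (2)
    xi_sx = rng.normal(0.0, sig_sx, n)           # intrinsic errors (Gaussian: assumption)
    xi_sy = rng.normal(0.0, sig_sy, n)
    x_s, y_s = x_true + xi_sx, y_true + xi_sy    # actual points, Eq. (3)
    eps = y_s - (a * x_s + b)                    # intrinsic scatter about the line, Eq. (1)
    X = x_s + rng.normal(0.0, sig_fx, n)         # observed points, Eq. (5)
    Y = y_s + rng.normal(0.0, sig_fy, n)
    return {"x_s": x_s, "y_s": y_s, "xi_sx": xi_sx, "xi_sy": xi_sy,
            "eps": eps, "X": X, "Y": Y}
```

Setting `sig_fx = sig_fy = 0` mimics an extreme structural model (observed points coincide with actual points), while `sig_sx = sig_sy = 0` mimics a functional model.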

In the case under discussion, the regression estimator minimizes the sum
(over the n observations) of squared residuals (e.g., Y69), or statistical
distances of the observed points, $P_i \equiv (X_i, Y_i)$, from the estimated line in the
unknown parameters, $a, b, x_1, ..., x_n$ (e.g., F87, Chap. 1, §1.3.3). Under
restrictive assumptions, the regression estimator is the functional maximum
likelihood estimator (e.g., Carroll et al., 2006, Chap. 3, §3.4.2).

The coordinates, $(\hat{x}_i, \hat{y}_i)$, may be conceived as the adjusted values of
related observations, $(X_i, Y_i)$, on the estimated regression line (Y66; Y69) and,
in addition, as estimators of the coordinates, $(x_i^*, y_i^*)$, on the true regression
line determined in absence of measurement errors. The line of adjustment,
$P_i\hat{P}_i$ (e.g., Y69), may be conceived as an estimator of the statistical distance,
$P_iP_i^*$ (e.g., F87, Chap. 1, §1.3.3), where $\hat{P}_i(\hat{x}_i, \hat{y}_i)$ is the adjusted point on the
estimated regression line:

$$ \hat{y}_i = \hat{a}\hat{x}_i + \hat{b} ; \qquad 1 \le i \le n ; \tag{8} $$

where, in general, estimators are denoted by hats, and $P_i^*(x_i^*, y_i^*)$ is the true
point on the true regression line, Eq. (2).

To the knowledge of the author, only classical error models are consid- 
ered for astronomical applications, and for this reason different error models 
such as Berkson models and mixture error models (e.g., Carroll et al., 2006, 
Chap. 3, Sect. 3.2) shall not be dealt with in the current attempt. From this 
point on, investigation shall be limited to extreme structural models and 
least-squares regression estimators for the following reasons. First, they are 
important models in their own right, furnishing an approximation to
real-world situations. Second, a careful examination of these simple models helps
in understanding the theoretical underpinnings of methods for other models
of greater complexity such as hierarchical models (e.g., Kelly, 2007). 



2.2 Extreme structural models 

With regard to extreme structural models, bivariate least squares linear
regression was analysed in two classical papers in the special case of oblique
regression i.e. constant variance ratio, $(\sigma_{yy})_i/(\sigma_{xx})_i = c^2$, $1 \le i \le n$, and
constant correlation coefficients, $r_i = r$, $1 \le i \le n$. More specifically,
orthogonal ($c^2 = 1$) and oblique regression were analysed in the earlier (Ial90)
and in the later (FB92) paper, respectively. The (dimensionless) squared
weighted residuals can be defined as in the case of functional models (Y69):

$$ (\tilde{R}_i)^2 = \frac{w_{x_i}(X_i - x_i)^2 + w_{y_i}(Y_i - y_i)^2 - 2 r_i \sqrt{w_{x_i} w_{y_i}}\,(X_i - x_i)(Y_i - y_i)}{1 - r_i^2} ; \tag{9a} $$

$$ r_i = \frac{(\sigma_{xy})_i}{[(\sigma_{xx})_i(\sigma_{yy})_i]^{1/2}} ; \qquad |r_i| \le 1 ; \qquad 1 \le i \le n ; \tag{9b} $$

where $w_{x_i}$, $w_{y_i}$ are the weights of the various measurements (or observations)
and $r_i$ the correlation coefficients. The terms, $w_{x_i}(X_i - x_i)^2$, $w_{y_i}(Y_i - y_i)^2$,
$r_i$, $1 \le i \le n$, are dimensionless by definition. An equivalent formulation
in matrix formalism can be found in specific textbooks, where weighted true
residuals are conceived as (dimensionless) "statistical distances" from data
points to related points on the regression line [e.g., F87, Chap. 1, §1.3.3,
Eq. (1.3.16)].

Accordingly, the least-squares regression estimator can be expressed as in
the case of functional models (C11) where the weights, $w_{x_i}$, $w_{y_i}$, are related
to intrinsic scatter. Then the regression line slope and intercept estimators
take the same formal expression with respect to their counterparts related 
to functional models, while (in general) the contrary holds for regression line 
slope and intercept variance estimators. 

Classical results on extreme structural models (Ial90; FB92) are restricted
to oblique regression for homoscedastic data with constant correlation
coefficients ($w_{x_i} = w_x$, $w_{y_i} = w_y$, $r_i = r$, $1 \le i \le n$). In the following subsections,
the above mentioned results extended to heteroscedastic data shall be
expressed in terms of weighted deviation (matrix) traces (C11):



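$$ \tilde{Q}_{pq} = \sum_{i=1}^{n} Q_i(w_{x_i}, w_{y_i}, r_i)\,(X_i - \tilde{X})^p (Y_i - \tilde{Y})^q ; \tag{10} $$

$$ \tilde{Q}_{00} = \sum_{i=1}^{n} Q_i(w_{x_i}, w_{y_i}, r_i) = n\overline{Q} ; \tag{11} $$

where $\tilde{Q}_{pq}$ are the (weighted) pure ($p = 0$ and/or $q = 0$) and mixed ($p > 0$
and $q > 0$) deviation traces, and $\tilde{X}$, $\tilde{Y}$, are weighted means:

$$ \tilde{Z} = \frac{\sum_{i=1}^{n} W_i Z_i}{\sum_{i=1}^{n} W_i} ; \qquad Z = X, Y ; \tag{12} $$

$$ W_i = \frac{w_{x_i}\Omega_i^2}{1 + a^2\Omega_i^2 - 2 a r_i \Omega_i} ; \qquad 1 \le i \le n ; \tag{13} $$

$$ \Omega_i = \sqrt{\frac{w_{y_i}}{w_{x_i}}} ; \qquad 1 \le i \le n ; \tag{14} $$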

in the limit of homoscedastic data with equal correlation coefficients, $w_{x_i} = w_x$,
$w_{y_i} = w_y$, $r_i = r$, $1 \le i \le n$, which implies $Q_i(w_{x_i}, w_{y_i}, r_i) = Q(w_x, w_y, r) = Q$,
Eqs. (10), (11), (12), (13), and (14) reduce to:

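$$ \tilde{Q}_{pq} = Q S_{pq} ; \tag{15} $$

$$ S_{pq} = \sum_{i=1}^{n} (X_i - \overline{X})^p (Y_i - \overline{Y})^q ; \tag{16} $$

$$ \tilde{Q}_{00} = Q S_{00} ; \tag{17} $$

$$ S_{00} = n ; \tag{18} $$

$$ \tilde{Z} = \overline{Z} ; \qquad Z = X, Y ; \tag{19} $$

$$ W_i = W = \frac{w_x\Omega^2}{1 + a^2\Omega^2 - 2 a r \Omega} ; \qquad 1 \le i \le n ; \tag{20} $$

$$ \Omega_i = \Omega = \sqrt{\frac{w_y}{w_x}} ; \qquad 1 \le i \le n ; \tag{21} $$

where $S_{pq}$ are the (unweighted) pure ($p = 0$ and/or $q = 0$) and mixed ($p > 0$
and $q > 0$) deviation traces.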
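The weighted and unweighted deviation traces of Eqs. (10) and (16) lend themselves to a compact numerical definition. The sketch below is illustrative only: the helper names `weighted_mean` and `deviation_trace` are hypothetical, `Q` plays the role of the generic weights $Q_i$, and `W` the role of the weights $W_i$ entering the weighted means of Eq. (12).

```python
import numpy as np

def weighted_mean(Z, W):
    """Weighted mean of Z with weights W, as in Eq. (12)."""
    Z, W = np.asarray(Z, dtype=float), np.asarray(W, dtype=float)
    return np.sum(W * Z) / np.sum(W)

def deviation_trace(X, Y, p, q, Q=None, W=None):
    """Deviation trace of order (p, q).

    With Q and W omitted, unit weights are used and the unweighted
    trace S_pq of Eq. (16) is returned; otherwise the weighted trace
    of Eq. (10) is computed about the W-weighted means.
    """
    X, Y = np.asarray(X, dtype=float), np.asarray(Y, dtype=float)
    Q = np.ones_like(X) if Q is None else np.asarray(Q, dtype=float)
    W = np.ones_like(X) if W is None else np.asarray(W, dtype=float)
    Xm, Ym = weighted_mean(X, W), weighted_mean(Y, W)
    return np.sum(Q * (X - Xm) ** p * (Y - Ym) ** q)
```

For constant weights the weighted trace is simply the constant times $S_{pq}$, which is the content of Eq. (15).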

Turning to the general case and using the least-squares estimator,
$T_{\tilde{R}} = \sum_{i=1}^{n} (\tilde{R}_i)^2$, yields for regression line slope and intercept estimators the same
expression with respect to functional models (C11). Accordingly, regression
line slope and intercept estimators may be conceived similarly to state
functions in thermodynamics: for an assigned true point, $P_i^* \equiv (x_i^*, y_i^*)$, what is
relevant is the related observed point, $P_i \equiv (X_i, Y_i)$, regardless of the path
followed via instrumental and/or intrinsic scatter. More specifically, the
regression line intercept estimator obeys the equation (e.g., Y69; C11):

$$ \hat{b} = \tilde{Y} - \hat{a}\tilde{X} ; \tag{22} $$

which implies the "barycentre" of the data, $\tilde{P} \equiv (\tilde{X}, \tilde{Y})$, lies on the estimated
regression line, related to Eq. (8), and the regression line slope estimator
is one among three real solutions of a pseudo cubic equation or two real
solutions of a pseudo quadratic equation, where the coefficients are weakly
dependent on the unknown slope. For further details refer to earlier attempts
(Y66; Y69; C11). The above mentioned equations have the same formal
expression for functional and structural models, which also holds for the
regression line slope and intercept estimators.

The regression line slope and intercept variance estimators for functional
models, calculated using the method of partial differentiation (e.g., Y69)
and the method of moments estimators [e.g., F87, Chap. 1, §1.3.2, Eq. (1.3.7)
therein] yield, in general, different results (C11). The same is expected to
hold, a fortiori, for structural models, for which the method of moments
estimators and the $\delta$-method have been exploited in classical investigations (e.g.,
Ial90; FB92). Accordingly, related results shall be considered and expressed
in terms of unweighted deviation traces for homoscedastic data with equal
correlation coefficients and extended in terms of weighted deviation traces
for heteroscedastic data, with regard to a number of special cases considered
in earlier attempts in the limit of uncorrelated errors in X and in Y (Ial90;
FB92). With this restriction, the pseudo cubic equation reduces to:

$$ \tilde{V}_{20}\,a^3 - 2\tilde{V}_{11}\,a^2 - (\tilde{W}_{20} - \tilde{V}_{02})\,a + \tilde{W}_{11} = 0 ; \tag{23} $$

where the deviation traces are defined by Eq. (10), via Eq. (13), and $V_i = W_i^2/w_{x_i}$.
For further details refer to the parent paper (Y66) and to a recent
attempt (C11). A formulation of Euclidean and statistical squared residual
sum for homoscedastic and heteroscedastic data is expressed in Appendix A.
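Since the coefficients of the pseudo cubic equation, Eq. (23), depend on the unknown slope through the weights $W_i$, a natural numerical strategy is iteration: evaluate the weights at the current slope, solve the cubic, and repeat. The sketch below illustrates this for uncorrelated errors ($r_i = 0$). It is not taken from the papers quoted above: the function name, the stopping rule and, in particular, the choice of the real root closest to the current iterate are assumptions made here for illustration, and convergence is not guaranteed in general.

```python
import numpy as np

def york_slope_uncorrelated(X, Y, wx, wy, a0, tol=1e-12, max_iter=100):
    """Iterate the pseudo cubic equation, Eq. (23), starting from slope a0."""
    X, Y, wx, wy = (np.asarray(v, dtype=float) for v in (X, Y, wx, wy))
    a = float(a0)
    for _ in range(max_iter):
        W = wx * wy / (a * a * wy + wx)          # Eq. (13) with r_i = 0
        V = W * W / wx                           # V_i = W_i^2 / w_xi
        Xm = np.sum(W * X) / np.sum(W)           # weighted means, Eq. (12)
        Ym = np.sum(W * Y) / np.sum(W)
        U, T = X - Xm, Y - Ym
        V20, V11, V02 = np.sum(V * U * U), np.sum(V * U * T), np.sum(V * T * T)
        W20, W11 = np.sum(W * U * U), np.sum(W * U * T)
        roots = np.roots([V20, -2.0 * V11, -(W20 - V02), W11])
        real = roots[np.abs(roots.imag) < 1e-8 * (1.0 + np.abs(roots.real))].real
        a_new = float(real[np.argmin(np.abs(real - a))])   # assumed selection rule
        converged = abs(a_new - a) < tol * (1.0 + abs(a))
        a = a_new
        if converged:
            break
    b = Ym - a * Xm                              # Eq. (22)
    return a, b
```

A reasonable starting value `a0` is the ordinary least-squares slope; for data lying exactly on a line, the true slope is a root of Eq. (23) for any choice of weights.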



2.3 Errors in X negligible with respect to errors in Y 

In the limit of errors in X negligible with respect to errors in Y, $a^2(\sigma_{xx})_i \ll (\sigma_{yy})_i$,
$a(\sigma_{xy})_i \ll (\sigma_{yy})_i$, $1 \le i \le n$. Ideally, $(\sigma_{xx})_i \to 0$, $(\sigma_{xy})_i \to 0$,
$1 \le i \le n$, which implies $r_i \to 0$, $w_{x_i} \to +\infty$, $\Omega_i \to 0$, $W_i \to w_{y_i}$, $1 \le i \le n$.
Accordingly, the errors in X and in Y are uncorrelated.

For homoscedastic data, $w_{x_i} = w_x$, $w_{y_i} = w_y$, $1 \le i \le n$, the regression
line slope and intercept estimators are (Ial90; C11):

$$ \hat{a}_Y = \frac{S_{11}}{S_{20}} ; \tag{24} $$

$$ \hat{b}_Y = \overline{Y} - \hat{a}_Y\overline{X} ; \tag{25} $$

where the index, Y, stands for OLS(Y|X) i.e. ordinary least square regression
or, in general, WLS(Y|X) i.e. weighted least square regression of the dependent
variable, Y, against the independent variable, X (Ial90). Accordingly,
related models shall be quoted as Y models.
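The Y model estimators of Eqs. (24)-(25) are ratios of unweighted deviation traces and can be evaluated in a few lines. The sketch below is a minimal illustration; the names `S` and `fit_y_model` are hypothetical.

```python
import numpy as np

def S(X, Y, p, q):
    """Unweighted deviation trace S_pq, Eq. (16)."""
    X, Y = np.asarray(X, dtype=float), np.asarray(Y, dtype=float)
    return np.sum((X - X.mean()) ** p * (Y - Y.mean()) ** q)

def fit_y_model(X, Y):
    """Slope and intercept for Y models (OLS(Y|X)), Eqs. (24)-(25)."""
    a_Y = S(X, Y, 1, 1) / S(X, Y, 2, 0)
    b_Y = np.mean(Y) - a_Y * np.mean(X)
    return a_Y, b_Y
```

The result coincides with an ordinary least-squares fit of Y against X, for instance `np.polyfit(X, Y, 1)`.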

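The regression line slope and intercept variance estimators, in the special
case of normal residuals, may be calculated using different methods and/or
models [e.g., F87, Chap. 1, §1.3.2, Eq. (1.3.7) therein; FB92; C11]. The result
is:

$$ [(\hat\sigma_{\hat a_Y})_N]^2 = \frac{(\hat a_Y)^2}{n-2}\left[\frac{(n-2)R_Y}{\hat a_Y S_{11}} + \Theta(\hat a_Y, \hat a_Y, \hat a_X)\right] = \frac{(\hat a_Y)^2}{n-2}\left[\frac{\hat a_X - \hat a_Y}{\hat a_Y} + \Theta(\hat a_Y, \hat a_Y, \hat a_X)\right] ; \tag{26} $$

$$ [(\hat\sigma_{\hat b_Y})_N]^2 = \left[\frac{1}{\hat a_Y}\frac{S_{11}}{S_{00}} + (\overline{X})^2\right][(\hat\sigma_{\hat a_Y})_N]^2 - \frac{\hat a_Y}{n-2}\frac{S_{11}}{S_{00}}\,\Theta(\hat a_Y, \hat a_Y, \hat a_X) ; \tag{27} $$

where the index, N, denotes normal residuals, R is defined in Appendix A
and $\hat a_X = S_{02}/S_{11}$. The function, $\Theta(\hat a_Y, \hat a_Y, \hat a_X)$, is a special case of a more
general function, $\Theta(\hat a_C, \hat a_Y, \hat a_X)$ which, in turn, depends on the method and/or
model used. For further details refer to Appendix B.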
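When the method-dependent function $\Theta$ is neglected, the variance estimators for normal residuals depend only on $n$, $S_{11}$, $S_{20}$, $S_{02}$ and the mean of X. The sketch below evaluates them under that assumption, $\Theta = 0$, and reading Eqs. (26)-(27) as $(\hat a_Y)^2(\hat a_X - \hat a_Y)/[(n-2)\hat a_Y]$ for the slope and $[S_{11}/(\hat a_Y S_{00}) + \overline{X}^2]$ times the slope variance for the intercept; the function name is hypothetical.

```python
import numpy as np

def y_model_normal_variances(X, Y):
    """Slope and intercept variance for Y models, normal residuals, Theta = 0 assumed."""
    X, Y = np.asarray(X, dtype=float), np.asarray(Y, dtype=float)
    n = X.size
    u, v = X - X.mean(), Y - Y.mean()
    S20, S02, S11 = np.sum(u * u), np.sum(v * v), np.sum(u * v)
    a_Y, a_X = S11 / S20, S02 / S11              # requires S11 != 0
    var_a = a_Y ** 2 / (n - 2) * (a_X - a_Y) / a_Y
    var_b = (S11 / (a_Y * n) + X.mean() ** 2) * var_a
    return var_a, var_b
```

Under this assumption the expressions coincide with the textbook ordinary least-squares results, $s^2/S_{20}$ and $s^2(1/n + \overline{X}^2/S_{20})$, where $s^2$ is the residual sum of squares divided by $n-2$.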

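The regression line slope and intercept variance estimators, in the general
case of non normal residuals, may be calculated using the $\delta$-method (Ial90).
The result is:

$$ (\hat\sigma_{\hat a_Y})^2 = \frac{S_{22} + (\hat a_Y)^2 S_{40} - 2\hat a_Y S_{31}}{(S_{20})^2} ; \tag{28} $$

$$ (\hat\sigma_{\hat b_Y})^2 = \frac{\hat a_Y}{n}\,\frac{\hat a_X - \hat a_Y}{\hat a_Y}\,\frac{S_{11}}{S_{00}} + (\overline{X})^2 (\hat\sigma_{\hat a_Y})^2 - \frac{2}{n}\,\overline{X}\,\hat\sigma_{\hat b_Y \hat a_Y} ; \tag{29} $$

$$ \hat\sigma_{\hat b_Y \hat a_Y} = \frac{S_{12} + (\hat a_Y)^2 S_{30} - 2\hat a_Y S_{21}}{S_{20}} ; \tag{30} $$

where Eqs. (28)-(30) are equivalent to their counterparts expressed in the
parent paper (Ial90).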
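The $\delta$-method expressions for Y models, Eqs. (28)-(30), involve deviation traces up to fourth order. The sketch below evaluates them as read here: $(S_{22} + \hat a_Y^2 S_{40} - 2\hat a_Y S_{31})/S_{20}^2$ for the slope variance, $(S_{12} + \hat a_Y^2 S_{30} - 2\hat a_Y S_{21})/S_{20}$ for the mixed term, and the three-term combination of Eq. (29) for the intercept variance. The function name is hypothetical.

```python
import numpy as np

def delta_method_y_model(X, Y):
    """Y model fit with delta-method variance estimators, Eqs. (28)-(30)."""
    X, Y = np.asarray(X, dtype=float), np.asarray(Y, dtype=float)
    n, Xm = X.size, X.mean()
    u, v = X - Xm, Y - Y.mean()

    def S(p, q):
        # unweighted deviation trace S_pq, Eq. (16)
        return np.sum(u ** p * v ** q)

    a_Y, a_X = S(1, 1) / S(2, 0), S(0, 2) / S(1, 1)
    b_Y = Y.mean() - a_Y * Xm
    var_a = (S(2, 2) + a_Y ** 2 * S(4, 0) - 2.0 * a_Y * S(3, 1)) / S(2, 0) ** 2
    cov_ba = (S(1, 2) + a_Y ** 2 * S(3, 0) - 2.0 * a_Y * S(2, 1)) / S(2, 0)
    var_b = ((a_Y / n) * ((a_X - a_Y) / a_Y) * (S(1, 1) / n)
             + Xm ** 2 * var_a - (2.0 / n) * Xm * cov_ba)
    return a_Y, b_Y, var_a, var_b, cov_ba
```

Written in terms of the residuals $\epsilon_i = (Y_i - \overline{Y}) - \hat a_Y (X_i - \overline{X})$, the slope variance is $\sum (X_i - \overline{X})^2 \epsilon_i^2 / S_{20}^2$, which makes clear that no assumption on the residual distribution is involved.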

The application of the $\delta$-method provides asymptotic formulae which
underestimate the true regression coefficient uncertainty in samples with low
($n \lesssim 50$) or weakly correlated population (FB92). In the special case of
normal and data-independent residuals, $\Theta(\hat a_Y, \hat a_Y, \hat a_X) \to 0$, Eqs. (28), (29), must
necessarily reduce to (26), (27), respectively, which implies an additional
factor, $n/(n-2)$, in the first term on the right-hand side of Eqs. (28)-(30). For
further details refer to Appendix O 

The expressions of the regression line slope and intercept estimators and
related variance estimators for normal residuals, Eqs. (24), (25), (26), (27),
coincide with their counterparts determined for Y models in classical and
recent attempts [e.g., FB92, Eq. (4) therein in the limit $c^2 = \sigma_{yy}/\sigma_{xx} \to +\infty$;
Lavagnini and Magno, 2007, Eqs. (3)-(7) therein].

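For heteroscedastic data, the regression line slope and intercept estimators
are (C11):

$$ \hat a_Y = \frac{(\tilde w_y)_{11}}{(\tilde w_y)_{20}} ; \tag{31} $$

$$ \hat b_Y = \tilde{Y} - \hat a_Y \tilde{X} ; \tag{32} $$

where the weighted means, $\tilde{X}$ and $\tilde{Y}$, are defined by Eqs. (12)-(14).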
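For heteroscedastic data in Y models, the weights reduce to $W_i \to w_{y_i}$, so that Eqs. (31)-(32) amount to a weighted least-squares fit with weights $w_{y_i}$ entering both the deviation traces and the means. A minimal sketch follows; the function name and the argument `wy` are hypothetical.

```python
import numpy as np

def fit_y_model_weighted(X, Y, wy):
    """Slope and intercept for heteroscedastic Y models (WLS(Y|X)), Eqs. (31)-(32)."""
    X, Y, wy = (np.asarray(v, dtype=float) for v in (X, Y, wy))
    Xm = np.sum(wy * X) / np.sum(wy)             # weighted means, Eq. (12) with W_i = w_yi
    Ym = np.sum(wy * Y) / np.sum(wy)
    w11 = np.sum(wy * (X - Xm) * (Y - Ym))       # weighted deviation trace of order (1, 1)
    w20 = np.sum(wy * (X - Xm) ** 2)             # weighted deviation trace of order (2, 0)
    a_Y = w11 / w20
    b_Y = Ym - a_Y * Xm
    return a_Y, b_Y
```

For equal weights the result reduces to the homoscedastic estimators of Eqs. (24)-(25). A cross-check is possible with `np.polyfit(X, Y, 1, w=np.sqrt(wy))`, since NumPy applies its weights to the unsquared residuals.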

For functional models, regression line slope and intercept variance estimators
in the general case of heteroscedastic data reduce to their counterparts
in the special case of homoscedastic data, as $\{\hat\sigma_{\hat a_Y}[(\tilde w_y)_{pq}]\}^2 \to [\hat\sigma_{\hat a_Y}(w_y S_{pq})]^2$,
$\{\hat\sigma_{\hat b_Y}[(\tilde w_y)_{pq}]\}^2 \to [\hat\sigma_{\hat b_Y}(w_y S_{pq})]^2$, via Eq. (15) where $Q_i = (w_y)_i = w_y$, $1 \le i \le n$.
For further details refer to an earlier attempt (C11).

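Under the assumption that the same holds for extreme structural models,
Eqs. (26)-(30) take the general expression:

$$ [(\hat\sigma_{\hat a_Y})_N]^2 = \frac{(\hat a_Y)^2}{n-2}\left[\frac{n-2}{n}\,\frac{R_Y (\tilde w_y)_{00}}{\hat a_Y (\tilde w_y)_{11}} + \Theta(\hat a_Y, \hat a_Y, \hat a'_X)\right] = \frac{(\hat a_Y)^2}{n-2}\left[\frac{\hat a'_X - \hat a_Y}{\hat a_Y} + \Theta(\hat a_Y, \hat a_Y, \hat a'_X)\right] ; \tag{33} $$

$$ [(\hat\sigma_{\hat b_Y})_N]^2 = \left[\frac{1}{\hat a_Y}\frac{(\tilde w_y)_{11}}{(\tilde w_y)_{00}} + (\tilde{X})^2\right][(\hat\sigma_{\hat a_Y})_N]^2 - \frac{\hat a_Y}{n-2}\,\frac{(\tilde w_y)_{11}}{(\tilde w_y)_{00}}\,\Theta(\hat a_Y, \hat a_Y, \hat a'_X) ; \tag{34} $$

$$ (\hat\sigma_{\hat a_Y})^2 = \frac{(\tilde w_y)_{00}}{n}\,\frac{(\tilde w_y)_{22} + (\hat a_Y)^2 (\tilde w_y)_{40} - 2\hat a_Y (\tilde w_y)_{31}}{[(\tilde w_y)_{20}]^2} ; \tag{35} $$

$$ (\hat\sigma_{\hat b_Y})^2 = \frac{\hat a_Y}{n}\,\frac{\hat a'_X - \hat a_Y}{\hat a_Y}\,\frac{(\tilde w_y)_{11}}{(\tilde w_y)_{00}} + (\tilde{X})^2 (\hat\sigma_{\hat a_Y})^2 - \frac{2}{n}\,\tilde{X}\,\hat\sigma_{\hat b_Y \hat a_Y} ; \tag{36} $$

$$ \hat\sigma_{\hat b_Y \hat a_Y} = \frac{(\tilde w_y)_{12} + (\hat a_Y)^2 (\tilde w_y)_{30} - 2\hat a_Y (\tilde w_y)_{21}}{(\tilde w_y)_{20}} ; \tag{37} $$

where $\hat a'_X = (\tilde w_y)_{02}/(\tilde w_y)_{11}$, R is defined in Appendix A and $\Theta$ is expressed in
terms of $n(\tilde w_y)_{pq}/(\tilde w_y)_{00}$ instead of $S_{pq}$.

In the special case of normal and data-independent residuals, $\Theta(\hat a_Y, \hat a_Y, \hat a'_X) \to 0$,
Eqs. (35), (36), must necessarily reduce to (33), (34), respectively, which
implies an additional factor, $n/(n-2)$, in the first term on the right-hand
side of Eqs. (35)-(37).

In absence of a rigorous proof, Eqs. (33)-(37) must be considered as
approximate results.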

2.4 Errors in Y negligible with respect to errors in X

In the limit of errors in Y negligible with respect to errors in X, $(\sigma_{yy})_i \ll a^2(\sigma_{xx})_i$,
$(\sigma_{xy})_i \ll a(\sigma_{xx})_i$, $1 \le i \le n$. Ideally, $(\sigma_{yy})_i \to 0$, $(\sigma_{xy})_i \to 0$, $1 \le i \le n$,
which implies $r_i \to 0$, $w_{y_i} \to +\infty$, $\Omega_i \to +\infty$, $W_i \to w_{x_i}/a^2$, $1 \le i \le n$.
Accordingly, the errors in X and in Y are uncorrelated. As outlined in an
earlier paper (C11), the model under discussion can be related to the inverse
regression, which has a large associated literature (e.g., Miller, 1966; Garden
et al., 1980; Osborne, 1991; Brown, 1993; Lavagnini and Magno, 2007).



For homoscedastic data, $w_{x_i} = w_x$, $w_{y_i} = w_y$, $1 \le i \le n$, the regression
line slope and intercept estimators are (Ial90; C11):

$$ \hat a_X = \frac{S_{02}}{S_{11}} ; \tag{38} $$

$$ \hat b_X = \overline{Y} - \hat a_X \overline{X} ; \tag{39} $$

where the index, X, stands for OLS(X|Y) i.e. ordinary least square regression
or, in general, WLS(X|Y) i.e. weighted least square regression of the dependent
variable, X, against the independent variable, Y (Ial90). Accordingly,
related models shall be quoted as X models.
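The X model estimators of Eqs. (38)-(39) correspond to regressing X against Y and then expressing the resulting line in the (X, Y) plane. A minimal sketch follows, with a hypothetical function name.

```python
import numpy as np

def fit_x_model(X, Y):
    """Slope and intercept for X models (OLS(X|Y)), Eqs. (38)-(39)."""
    X, Y = np.asarray(X, dtype=float), np.asarray(Y, dtype=float)
    u, v = X - X.mean(), Y - Y.mean()
    a_X = np.sum(v * v) / np.sum(u * v)          # S_02 / S_11, requires S_11 != 0
    b_X = Y.mean() - a_X * X.mean()
    return a_X, b_X
```

Equivalently, if `beta, alpha = np.polyfit(Y, X, 1)` is the ordinary fit of X against Y, then the slope above equals `1/beta` and the intercept equals `-alpha/beta`.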

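The regression line slope and intercept variance estimators, in the special
case of normal residuals, may be calculated using different methods and/or
models [e.g., F87, Chap. 1, §1.3.2, Eq. (1.3.7) therein; FB92; C11]. The result
is:

$$ [(\hat\sigma_{\hat a_X})_N]^2 = \frac{(\hat a_X)^2}{n-2}\left[\frac{(n-2)R_X}{\hat a_X S_{11}} + \Theta(\hat a_X, \hat a_Y, \hat a_X)\right] = \frac{(\hat a_X)^2}{n-2}\left[\frac{\hat a_X - \hat a_Y}{\hat a_Y} + \Theta(\hat a_X, \hat a_Y, \hat a_X)\right] ; \tag{40} $$

$$ [(\hat\sigma_{\hat b_X})_N]^2 = \left[\frac{1}{\hat a_X}\frac{S_{11}}{S_{00}} + (\overline{X})^2\right][(\hat\sigma_{\hat a_X})_N]^2 - \frac{\hat a_X}{n-2}\frac{S_{11}}{S_{00}}\,\Theta(\hat a_X, \hat a_Y, \hat a_X) ; \tag{41} $$

where the index, N, denotes normal residuals, R is defined in Appendix A
and $\hat a_Y = S_{11}/S_{20}$. The function, $\Theta(\hat a_X, \hat a_Y, \hat a_X)$, is a special case of a more
general function, $\Theta(\hat a_C, \hat a_Y, \hat a_X)$ which, in turn, depends on the method and/or
model used. For further details refer to Appendix B.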

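The regression line slope and intercept variance estimators, in the general
case of non normal residuals, may be calculated using the $\delta$-method (Ial90).
The result is:

$$ (\hat\sigma_{\hat a_X})^2 = \frac{S_{04} + (\hat a_X)^2 S_{22} - 2\hat a_X S_{13}}{(S_{11})^2} ; \tag{42} $$

$$ (\hat\sigma_{\hat b_X})^2 = \frac{\hat a_X}{n}\,\frac{\hat a_X - \hat a_Y}{\hat a_Y}\,\frac{S_{11}}{S_{00}} + (\overline{X})^2 (\hat\sigma_{\hat a_X})^2 - \frac{2}{n}\,\overline{X}\,\hat\sigma_{\hat b_X \hat a_X} ; \tag{43} $$

$$ \hat\sigma_{\hat b_X \hat a_X} = \frac{S_{03} + (\hat a_X)^2 S_{21} - 2\hat a_X S_{12}}{S_{11}} ; \tag{44} $$

where Eqs. (42)-(44) are equivalent to their counterparts expressed in the
parent paper (Ial90).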

The application of the $\delta$-method provides asymptotic formulae which
underestimate the true regression coefficient uncertainty in samples with low
($n \lesssim 50$) or weakly correlated population (FB92). In the special case of
normal and data-independent residuals, $\Theta(\hat a_X, \hat a_Y, \hat a_X) \to 0$, Eqs. (42), (43), must
necessarily reduce to (40), (41), respectively, which implies an additional
factor, $n/(n-2)$, in the first term on the right-hand side of Eqs. (42)-(44). For
further details refer to Appendix O 

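For heteroscedastic data, the regression line slope and intercept estimators
are (C11):

$$ \hat a_X = \frac{(\tilde w_x)_{02}}{(\tilde w_x)_{11}} ; \tag{45} $$

$$ \hat b_X = \tilde{Y} - \hat a_X \tilde{X} ; \tag{46} $$

where the weighted means, $\tilde{X}$ and $\tilde{Y}$, are defined by Eqs. (12)-(14).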

For functional models, regression line slope and intercept variance estimators
in the general case of heteroscedastic data reduce to their counterparts
in the special case of homoscedastic data, as $\{\hat\sigma_{\hat a_X}[(\tilde w_x)_{pq}]\}^2 \to [\hat\sigma_{\hat a_X}(w_x S_{pq})]^2$,
$\{\hat\sigma_{\hat b_X}[(\tilde w_x)_{pq}]\}^2 \to [\hat\sigma_{\hat b_X}(w_x S_{pq})]^2$, via Eq. (15) where $Q_i = (w_x)_i = w_x$, $1 \le i \le n$.
For further details refer to an earlier attempt (C11).

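Under the assumption that the same holds for extreme structural models,
Eqs. (40)-(44) take the general expression:

$$ [(\hat\sigma_{\hat a_X})_N]^2 = \frac{(\hat a_X)^2}{n-2}\left[\frac{n-2}{n}\,\frac{R_X (\tilde w_x)_{00}}{\hat a_X (\tilde w_x)_{11}} + \Theta(\hat a_X, \hat a'_Y, \hat a_X)\right] = \frac{(\hat a_X)^2}{n-2}\left[\frac{\hat a_X - \hat a'_Y}{\hat a'_Y} + \Theta(\hat a_X, \hat a'_Y, \hat a_X)\right] ; \tag{47} $$

$$ [(\hat\sigma_{\hat b_X})_N]^2 = \left[\frac{1}{\hat a_X}\frac{(\tilde w_x)_{11}}{(\tilde w_x)_{00}} + (\tilde{X})^2\right][(\hat\sigma_{\hat a_X})_N]^2 - \frac{\hat a_X}{n-2}\,\frac{(\tilde w_x)_{11}}{(\tilde w_x)_{00}}\,\Theta(\hat a_X, \hat a'_Y, \hat a_X) ; \tag{48} $$

$$ (\hat\sigma_{\hat a_X})^2 = \frac{(\tilde w_x)_{00}}{n}\,\frac{(\tilde w_x)_{04} + (\hat a_X)^2 (\tilde w_x)_{22} - 2\hat a_X (\tilde w_x)_{13}}{[(\tilde w_x)_{11}]^2} ; \tag{49} $$

$$ (\hat\sigma_{\hat b_X})^2 = \frac{\hat a_X}{n}\,\frac{\hat a_X - \hat a'_Y}{\hat a'_Y}\,\frac{(\tilde w_x)_{11}}{(\tilde w_x)_{00}} + (\tilde{X})^2 (\hat\sigma_{\hat a_X})^2 - \frac{2}{n}\,\tilde{X}\,\hat\sigma_{\hat b_X \hat a_X} ; \tag{50} $$

$$ \hat\sigma_{\hat b_X \hat a_X} = \frac{(\tilde w_x)_{03} + (\hat a_X)^2 (\tilde w_x)_{21} - 2\hat a_X (\tilde w_x)_{12}}{(\tilde w_x)_{11}} ; \tag{51} $$

where $\hat a'_Y = (\tilde w_x)_{11}/(\tilde w_x)_{20}$, R is defined in Appendix A and $\Theta$ is formulated
in terms of $n(\tilde w_x)_{pq}/(\tilde w_x)_{00}$ instead of $S_{pq}$.

In the special case of normal and data-independent residuals, $\Theta(\hat a_X, \hat a'_Y, \hat a_X) \to 0$,
Eqs. (49), (50), must necessarily reduce to (47), (48), respectively, which
implies an additional factor, $n/(n-2)$, in the first term on the right-hand
side of Eqs. (49)-(51).

In absence of a rigorous proof, Eqs. (47)-(51) must be considered as
approximate results.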



2.5 Oblique regression 

In the limit of constant y to x variance ratios and constant correlation 
coefficients, the following relations hold: 

$$ \frac{(\sigma_{yy})_i}{(\sigma_{xx})_i} = c^2 ; \qquad \frac{w_{x_i}}{w_{y_i}} = \Omega_i^{-2} = c^2 ; \qquad \frac{(\sigma_{xy})_i}{(\sigma_{xx})_i} = r_i c = r c ; \qquad 1 \le i \le n ; \tag{52} $$

$$ W_i = \frac{w_{x_i}}{a^2 + c^2 - 2rac} ; \qquad 1 \le i \le n ; \qquad \frac{(\tilde w_y)_{pq}}{(\tilde w_y)_{rs}} = \frac{(\tilde w_x)_{pq}}{(\tilde w_x)_{rs}} ; \tag{53} $$

where the weights are assumed to be inversely proportional to related
variances, $w_{z_i} \propto 1/(\sigma_{zz})_i$, $z = x, y$, as usually done (e.g., FB92). By definition,
c has the dimensions of a slope, which highly simplifies dimension checks
throughout equations, and for this reason it has been favoured with respect
to different choices exploited in earlier attempts (e.g., Y66; F87, Chap. 1,
§1.3; FB92).

It is worth noticing that Eq. (52) holds for both homoscedastic and
heteroscedastic data. It can be seen that the lines of adjustment are oriented
along the same direction (York, 1967) but are perpendicular to the regression
line only in the special case of orthogonal regression, $c^2 = 1$ (e.g., Carroll et
al., 2006, Chap. 3, §3.4.2). Accordingly, the term "oblique regression" has
been preferred with respect to "generalized orthogonal regression" used in
an earlier attempt (C11).

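The variance ratio, $c^2$, may be expressed in terms of instrumental and
intrinsic variance ratios, $c_F^2$ and $c_S^2$, respectively, as:

$$ c^2 = \frac{[(\sigma_{xx})_i]_F}{(\sigma_{xx})_i}\,c_F^2 + \frac{[(\sigma_{xx})_i]_S}{(\sigma_{xx})_i}\,c_S^2 ; \qquad 1 \le i \le n ; \tag{54a} $$

$$ c_F^2 = \frac{[(\sigma_{yy})_i]_F}{[(\sigma_{xx})_i]_F} ; \qquad c_S^2 = \frac{[(\sigma_{yy})_i]_S}{[(\sigma_{xx})_i]_S} ; \tag{54b} $$

where $c_F = c_S$ implies $c_F^2 = c_S^2 = c^2$; $c^2 \to c_F^2$ for functional models,
$[(\sigma_{zz})_i]_S \ll [(\sigma_{zz})_i]_F$, $z = x, y$, $1 \le i \le n$; $c^2 \to c_S^2$ for extreme structural models,
$[(\sigma_{zz})_i]_F \ll [(\sigma_{zz})_i]_S$, $z = x, y$, $1 \le i \le n$.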
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For homoscedastic data, w x . = w x , w y . = w y , 1 < i < n, the regression 
line slope and intercept estimators are (FB92; Cll): 



a c 



S02 — C 2 S20 
2^ 



Q-xQ-Y — c 



2ay 

bc = Y-a c X 



IT 



it 



1 + c 



ITc 2 




1/2) 



\ 2ay 



(55) 
(56) 



where the index, C, denotes oblique regression, ay = Sn/S 2 o; d x = So 2 /Sn; 
and the double sign corresponds to the solutions of a second-degree equation, 
where the parasite solution must be disregarded. Accordingly, related models 
shall be quoted as C models or O models in the special case of orthogonal 
regression (c 2 = 1). For further details refer to an earlier attempt (Cll). 
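The slope and intercept estimators of Eqs. (55) and (56) can be sketched numerically as follows. This is a minimal illustration, not the code used for the paper: it assumes that, for homoscedastic data, the deviation traces reduce to $S_{pq} = \sum_i (x_i - \overline{X})^p (y_i - \overline{Y})^q$, and the names `deviation_trace` and `oblique_fit` are hypothetical. The double sign is resolved by keeping the root of the second-degree equation, $S_{11} a^2 - (S_{02} - c^2 S_{20}) a - c^2 S_{11} = 0$, whose sign coincides with the sign of $S_{11}$.

```python
import math


def deviation_trace(x, y, p, q):
    """Assumed definition: S_pq = sum_i (x_i - mean x)^p (y_i - mean y)^q."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((xi - mx) ** p * (yi - my) ** q for xi, yi in zip(x, y))


def oblique_fit(x, y, c2=1.0):
    """Slope and intercept of the oblique regression line for variance ratio c2 = c^2."""
    n = len(x)
    s20 = deviation_trace(x, y, 2, 0)
    s02 = deviation_trace(x, y, 0, 2)
    s11 = deviation_trace(x, y, 1, 1)
    if s11 == 0.0:
        raise ValueError("S_11 = 0: the slope is undefined")
    d = s02 - c2 * s20
    # Root of S_11 a^2 - d a - c2 S_11 = 0 sharing the sign of S_11;
    # the other (parasite) root has the opposite sign and is disregarded.
    a = (d + math.sqrt(d * d + 4.0 * c2 * s11 * s11)) / (2.0 * s11)
    b = sum(y) / n - a * sum(x) / n
    return a, b
```

With `c2 = 1.0` the function returns the orthogonal regression (O model) line; very large and very small values of `c2` approach the Y and X model slopes, $S_{11}/S_{20}$ and $S_{02}/S_{11}$, respectively.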

The regression line slope and intercept variance estimators, in the special case of normal residuals, may be calculated using different methods and/or models [e.g., F87, Chap. 1, §1.3.2, Eq. (1.3.7) therein; FB92; C11]. The result is:

$$[(\hat\sigma_{\hat a_C})_N]^2 = \frac{(\hat a_C)^2}{n-2} \left[ \frac{(n-2) R_C}{\hat a_C S_{11}} + \Theta(\hat a_C, \hat a_Y, \hat a_X) \right] = \frac{(\hat a_C)^2}{n-2} \left[ \frac{\hat a_X - \hat a_C}{\hat a_C} + \frac{\hat a_C - \hat a_Y}{\hat a_Y} + \Theta(\hat a_C, \hat a_Y, \hat a_X) \right] ; \tag{57}$$

$$[(\hat\sigma_{\hat b_C})_N]^2 = \left[ \frac{1}{\hat a_C} \frac{S_{11}}{S_{00}} + (\overline{X})^2 \right] [(\hat\sigma_{\hat a_C})_N]^2 - \frac{\hat a_C}{n-2} \frac{S_{11}}{S_{00}} \Theta(\hat a_C, \hat a_Y, \hat a_X) ; \tag{58}$$

where $R_C$ is defined in Appendix A and $\Theta$ depends on the method and/or model used, as shown in Table 1. For a formal demonstration, see Appendix E. The extreme situations, $\hat a_C \to \hat a_Y$, $\hat a_C \to \hat a_X$, are directly inferred from Table 1 as $\Theta(\hat a_Y, \hat a_Y, \hat a_X) = 0$, $\Theta(\hat a_X, \hat a_Y, \hat a_X) = 0$, respectively, in all cases. More specifically, the former relation rigorously holds while the latter has to be restricted to large ($n \gg 1$, ideally $n \to +\infty$) samples when appropriate. In general, $\Theta$ can be neglected with respect to the remaining terms in the asymptotic expressions of Eqs. (57) and (58). If the residuals are independent of the data, $\Theta$ also vanishes regardless of the sample population. For further details refer to Appendix B and C.
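As an illustration of Eqs. (57) and (58), the following sketch evaluates the normal-residual variance estimators for an assigned slope. It is a hedged example under stated assumptions: homoscedastic data with $S_{00} = n$, the second form of Eq. (57) (expressed via the slope ratios), and $\Theta$ supplied by the caller (zero by default, as for data-independent residuals). The function name and argument list are hypothetical.

```python
def normal_residual_variances(a, s00, s20, s02, s11, mean_x, theta=0.0):
    """Variance estimators of slope and intercept, Eqs. (57)-(58), normal residuals.

    a      : slope estimator of the selected model (C, O, R, B, ...)
    s00    : S_00, assumed equal to the sample population n
    theta  : value of Theta(a, a_Y, a_X); zero for data-independent residuals
    """
    n = s00
    a_y = s11 / s20          # Y model slope
    a_x = s02 / s11          # X model slope
    bracket = (a_x - a) / a + (a - a_y) / a_y + theta
    var_a = a * a / (n - 2) * bracket
    var_b = (s11 / (a * s00) + mean_x ** 2) * var_a \
        - a / (n - 2) * s11 / s00 * theta
    return var_a, var_b
```

For `a` equal to the Y model slope and `theta = 0`, the two outputs reduce to the textbook ordinary least squares expressions, $s^2/S_{20}$ and $s^2 (1/n + \overline{X}^2/S_{20})$, with $s^2$ the residual sum of squares divided by $n - 2$, which provides a simple consistency check.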

The regression line slope and intercept variance estimators, in the general case of non normal residuals, may be calculated using the $\delta$-method (Ial90;

Table 1: Explicit expression of the function, $\Theta(\hat a_C, \hat a_Y, \hat a_X)$, appearing in the slope and intercept variance estimator formulae for oblique regression, Eqs. (57) and (58), respectively, according to different methods and/or models. Symbol captions: $\Delta_{UV} = \hat a_U/\hat a_V - 1$; (U,V) = (X,C), (C,Y). Method captions: AFD - asymptotic formula determination; MME - method of moments estimators; LSE - least squares estimation; MPD - method of partial differentiation. Model captions: F - functional; S - structural; E - extreme structural. Case captions: HM - homoscedastic; HT - heteroscedastic. Reference captions: F987 (F87, Chap. 1, §1.3.2); FB92 (Feigelson and Babu, 1992); FB11 (FB92, erratum 2011); C011 (C11).



0(ac, d Y 


ox) 


method 


model 


case 


source 







AFD 


E 


HM 


FB11 


Axc^cy + 


(^cy) 2 
n— 1 


MME 


E 


HM 


FB92 


AxC^CY + 


(Acy) 2 
n— 1 


MME 


S 


HM 


F987 


AxcAcy 1 / 
n-l ~ V 


Acy) 2 
n-l) 


LSE 


E 


HM 


FB11 


2Axc^cy 


MPD 


F 


HT 


con 



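FB92). The result is:

$$(\hat\sigma_{\hat a_C})^2 = (\hat a_C)^2\, \frac{c^4 (\hat\sigma_{\hat a_Y})^2 + (\hat a_Y)^4 (\hat\sigma_{\hat a_X})^2 + 2 (\hat a_Y)^2 c^2\, \hat\sigma_{\hat a_Y \hat a_X}}{(\hat a_Y)^2 \left[ 4 (\hat a_Y)^2 c^2 + (\hat a_Y \hat a_X - c^2)^2 \right]} ; \tag{59}$$

$$(\hat\sigma_{\hat b_C})^2 = \frac{\hat a_C}{n} \left[ \frac{\hat a_X - \hat a_C}{\hat a_C} + \frac{\hat a_C - \hat a_Y}{\hat a_Y} \right] \frac{S_{11}}{S_{00}} + (\overline{X})^2 (\hat\sigma_{\hat a_C})^2 - \frac{2}{n} \overline{X} \left( \hat\sigma_{\hat b_Y \hat a_C} + \hat\sigma_{\hat b_X \hat a_C} \right) ; \tag{60}$$

$$\hat\sigma_{\hat a_Y \hat a_X} = \frac{S_{13} + \hat a_Y \hat a_X S_{31} - (\hat a_Y + \hat a_X) S_{22}}{S_{20} S_{11}} ; \tag{61}$$

$$\hat\sigma_{\hat b_Y \hat a_C} = \frac{\hat a_C c^2}{\hat a_Y \left[ 4 (\hat a_Y)^2 c^2 + (\hat a_Y \hat a_X - c^2)^2 \right]^{1/2}}\, \frac{S_{12} + \hat a_Y \hat a_C S_{30} - (\hat a_Y + \hat a_C) S_{21}}{S_{20}} ; \tag{62}$$

$$\hat\sigma_{\hat b_X \hat a_C} = \frac{\hat a_C \hat a_Y}{\left[ 4 (\hat a_Y)^2 c^2 + (\hat a_Y \hat a_X - c^2)^2 \right]^{1/2}}\, \frac{S_{03} + \hat a_X \hat a_C S_{21} - (\hat a_X + \hat a_C) S_{12}}{S_{11}} ; \tag{63}$$

where Eqs. (59) and (60) in the special case, $c^2 = 1$, are equivalent to their counterparts expressed in the parent paper (Ial90) provided absolute values appearing therein are removed. For a formal discussion refer to Appendix D. In addition, Eq. (59) is equivalent to its counterpart expressed in the parent paper (FB92, erratum 2011).

The dependence on the variance ratio, $c^2$, in Eqs. (59), (62), (63), may be eliminated via Eq. (142). The result is:

$$(\hat\sigma_{\hat a_C})^2 = (\hat a_C)^2\, \frac{(\hat a_C)^4 (\Delta_{XC})^2 (\hat\sigma_{\hat a_Y})^2 + (\hat a_Y)^4 (\Delta_{CY})^2 (\hat\sigma_{\hat a_X})^2 + 2 (\hat a_Y)^2 (\hat a_C)^2 \Delta_{XC} \Delta_{CY}\, \hat\sigma_{\hat a_Y \hat a_X}}{(\hat a_Y)^2 \left\{ 4 (\hat a_Y)^2 (\hat a_C)^2 \Delta_{XC} \Delta_{CY} + \left[ \hat a_Y \hat a_X \Delta_{CY} - (\hat a_C)^2 \Delta_{XC} \right]^2 \right\}} ; \tag{64}$$

$$\hat\sigma_{\hat b_Y \hat a_C} = \frac{(\hat a_C)^3 \Delta_{XC}}{\hat a_Y \left\{ 4 (\hat a_Y)^2 (\hat a_C)^2 \Delta_{XC} \Delta_{CY} + \left[ \hat a_Y \hat a_X \Delta_{CY} - (\hat a_C)^2 \Delta_{XC} \right]^2 \right\}^{1/2}}\, \frac{S_{12} + \hat a_Y \hat a_C S_{30} - (\hat a_Y + \hat a_C) S_{21}}{S_{20}} ; \tag{65}$$

$$\hat\sigma_{\hat b_X \hat a_C} = \frac{\hat a_C \hat a_Y \Delta_{CY}}{\left\{ 4 (\hat a_Y)^2 (\hat a_C)^2 \Delta_{XC} \Delta_{CY} + \left[ \hat a_Y \hat a_X \Delta_{CY} - (\hat a_C)^2 \Delta_{XC} \right]^2 \right\}^{1/2}}\, \frac{S_{03} + \hat a_X \hat a_C S_{21} - (\hat a_X + \hat a_C) S_{12}}{S_{11}} ; \tag{66}$$

$$\Delta_{UV} = \frac{\hat a_U}{\hat a_V} - 1 ; \quad (U,V) = (X,C), (C,Y), (X,Y) ; \tag{67}$$

in terms of slope estimators, variance slope estimators, and deviation traces.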
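The elimination of the variance ratio can be checked by hand. The following worked step is a sketch under the assumption that the relation quoted above as Eq. (142) is equivalent to solving, with respect to $c^2$, the second-degree equation whose roots are expressed by Eq. (55); the form actually printed in the Appendix may differ.

```latex
\begin{align*}
& S_{11} (\hat a_C)^2 - (S_{02} - c^2 S_{20})\, \hat a_C - c^2 S_{11} = 0 ; \\
& c^2 = \frac{\hat a_C (S_{02} - \hat a_C S_{11})}{\hat a_C S_{20} - S_{11}}
      = \hat a_C \hat a_Y\, \frac{\hat a_X - \hat a_C}{\hat a_C - \hat a_Y}
      = (\hat a_C)^2\, \frac{\Delta_{XC}}{\Delta_{CY}} ; \\
& 4 (\hat a_Y)^2 c^2 + (\hat a_Y \hat a_X - c^2)^2
      = \frac{4 (\hat a_Y)^2 (\hat a_C)^2 \Delta_{XC} \Delta_{CY}
        + \left[ \hat a_Y \hat a_X \Delta_{CY} - (\hat a_C)^2 \Delta_{XC} \right]^2}{(\Delta_{CY})^2} ;
\end{align*}
```

where $S_{02} = \hat a_X S_{11}$ and $S_{20} = S_{11}/\hat a_Y$ have been used in the second line. Substituting the last two relations into the $c^2$-dependent expressions and multiplying numerator and denominator by $(\Delta_{CY})^2$, or by $\Delta_{CY}$ where the square root appears, reproduces the structure of the $c^2$-free expressions.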

The application of the $\delta$-method provides asymptotic formulae which underestimate the true regression coefficient uncertainty in samples with low ($n \lesssim 50$) or weakly correlated population (FB92). In the special case of normal and data-independent residuals, $\Theta(\hat a_C, \hat a_Y, \hat a_X) \to 0$, Eqs. (59), (60), must necessarily reduce to (57), (58), respectively, which implies an additional factor, $n/(n-2)$, in the first term on the right-hand side of Eqs. (28), (42), and (60)-(63). For further details refer to Appendix C.

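For heteroscedastic data, the regression line slope and intercept estimators are (C11):

$$\hat a_C = \frac{(w_x)_{02} - c^2 (w_x)_{20}}{2 (w_x)_{11}} \left\{ 1 \mp \left[ 1 + c^2 \left( \frac{2 (w_x)_{11}}{(w_x)_{02} - c^2 (w_x)_{20}} \right)^2 \right]^{1/2} \right\} = \frac{\hat a_X \hat a'_Y - c^2}{2 \hat a'_Y} \left\{ 1 \mp \left[ 1 + c^2 \left( \frac{2 \hat a'_Y}{\hat a_X \hat a'_Y - c^2} \right)^2 \right]^{1/2} \right\} ; \tag{68}$$

$$\hat b_C = \overline{Y} - \hat a_C \overline{X} ; \tag{69}$$

where $\hat a'_Y = (w_x)_{11}/(w_x)_{20}$; $\hat a_X = (w_x)_{02}/(w_x)_{11}$; and the weighted means, $\overline{X}$,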
Y, are defined by Eqs. (ffijl -ffTD. 
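A heteroscedastic counterpart of the slope and intercept estimators can be sketched as follows. The weighted deviation traces are assumed here to read $(w_x)_{pq} = \sum_i (w_x)_i (x_i - \overline{X})^p (y_i - \overline{Y})^q$, with $\overline{X}$, $\overline{Y}$ weighted means using the same weights; this definition is not quoted in the present section and should be checked against the one adopted in the paper. Function names are hypothetical.

```python
import math


def weighted_trace(x, y, w, p, q):
    """Assumed definition: (w)_pq = sum_i w_i (x_i - X)^p (y_i - Y)^q, weighted means X, Y."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    return sum(wi * (xi - mx) ** p * (yi - my) ** q
               for wi, xi, yi in zip(w, x, y))


def oblique_fit_weighted(x, y, wx, c2=1.0):
    """Oblique regression slope and intercept from weighted deviation traces."""
    sw = sum(wx)
    mx = sum(wi * xi for wi, xi in zip(wx, x)) / sw
    my = sum(wi * yi for wi, yi in zip(wx, y)) / sw
    w20 = weighted_trace(x, y, wx, 2, 0)
    w02 = weighted_trace(x, y, wx, 0, 2)
    w11 = weighted_trace(x, y, wx, 1, 1)
    if w11 == 0.0:
        raise ValueError("(w_x)_11 = 0: the slope is undefined")
    d = w02 - c2 * w20
    # Keep the root sharing the sign of (w_x)_11; the parasite root is disregarded.
    a = (d + math.sqrt(d * d + 4.0 * c2 * w11 * w11)) / (2.0 * w11)
    return a, my - a * mx
```

Only weight ratios matter: multiplying all weights by a common factor leaves slope and intercept unchanged, so equal weights reproduce the homoscedastic estimators.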

For functional models, regression line slope and intercept variance estima- 
tors in the general case of heteroscedastic data reduce to their counterparts 
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in the special case of homoscedastic data, as {<Ja c [( w x)pq\} 2 —> [^a c ( w xSpq)] 2 , 
{<5& c [(^)pg]} 2 [°b c ( w x S p<i)\ 2 1 via E( l- (HHD where Q; = (w x )i = w x , 1 < % < 
n. For further details refer to an earlier attempt (C11).

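Under the assumption that the same holds for extreme structural models, Eqs. (57)-(66) take the general expression:

$$[(\hat\sigma_{\hat a_C})_N]^2 = \frac{(\hat a_C)^2}{n-2} \left[ \frac{n-2}{n} \frac{R_C (w_x)_{00}}{\hat a_C (w_x)_{11}} + \Theta(\hat a_C, \hat a'_Y, \hat a_X) \right] = \frac{(\hat a_C)^2}{n-2} \left[ \frac{\hat a_X - \hat a_C}{\hat a_C} + \frac{\hat a_C - \hat a'_Y}{\hat a'_Y} + \Theta(\hat a_C, \hat a'_Y, \hat a_X) \right] ; \tag{70}$$

$$[(\hat\sigma_{\hat b_C})_N]^2 = \left[ \frac{1}{\hat a_C} \frac{(w_x)_{11}}{(w_x)_{00}} + (\overline{X})^2 \right] [(\hat\sigma_{\hat a_C})_N]^2 - \frac{\hat a_C}{n-2} \frac{(w_x)_{11}}{(w_x)_{00}} \Theta(\hat a_C, \hat a'_Y, \hat a_X) ; \tag{71}$$

$$(\hat\sigma_{\hat a_C})^2 = (\hat a_C)^2\, \frac{c^4 (\hat\sigma_{\hat a'_Y})^2 + (\hat a'_Y)^4 (\hat\sigma_{\hat a_X})^2 + 2 (\hat a'_Y)^2 c^2\, \hat\sigma_{\hat a'_Y \hat a_X}}{(\hat a'_Y)^2 \left[ 4 (\hat a'_Y)^2 c^2 + (\hat a'_Y \hat a_X - c^2)^2 \right]} ; \tag{72}$$

$$(\hat\sigma_{\hat b_C})^2 = \frac{\hat a_C}{n} \left[ \frac{\hat a_X - \hat a_C}{\hat a_C} + \frac{\hat a_C - \hat a'_Y}{\hat a'_Y} \right] \frac{(w_x)_{11}}{(w_x)_{00}} + (\overline{X})^2 (\hat\sigma_{\hat a_C})^2 - \frac{2}{n} \overline{X} \left( \hat\sigma_{\hat b'_Y \hat a_C} + \hat\sigma_{\hat b_X \hat a_C} \right) ; \tag{73}$$

$$\hat\sigma_{\hat a'_Y \hat a_X} = \frac{(w_x)_{00}}{n}\, \frac{(w_x)_{13} + \hat a'_Y \hat a_X (w_x)_{31} - (\hat a'_Y + \hat a_X)(w_x)_{22}}{(w_x)_{20} (w_x)_{11}} ; \tag{74}$$

$$\hat\sigma_{\hat b'_Y \hat a_C} = \frac{\hat a_C c^2}{\hat a'_Y \left[ 4 (\hat a'_Y)^2 c^2 + (\hat a'_Y \hat a_X - c^2)^2 \right]^{1/2}}\, \frac{(w_x)_{12} + \hat a'_Y \hat a_C (w_x)_{30} - (\hat a'_Y + \hat a_C)(w_x)_{21}}{(w_x)_{20}} ; \tag{75}$$

$$\hat\sigma_{\hat b_X \hat a_C} = \frac{\hat a_C \hat a'_Y}{\left[ 4 (\hat a'_Y)^2 c^2 + (\hat a'_Y \hat a_X - c^2)^2 \right]^{1/2}}\, \frac{(w_x)_{03} + \hat a_X \hat a_C (w_x)_{21} - (\hat a_X + \hat a_C)(w_x)_{12}}{(w_x)_{11}} ; \tag{76}$$

$$(\hat\sigma_{\hat a_C})^2 = (\hat a_C)^2\, \frac{(\hat a_C)^4 (\Delta_{XC})^2 (\hat\sigma_{\hat a'_Y})^2 + (\hat a'_Y)^4 (\Delta_{CY'})^2 (\hat\sigma_{\hat a_X})^2 + 2 (\hat a'_Y)^2 (\hat a_C)^2 \Delta_{XC} \Delta_{CY'}\, \hat\sigma_{\hat a'_Y \hat a_X}}{(\hat a'_Y)^2 \left\{ 4 (\hat a'_Y)^2 (\hat a_C)^2 \Delta_{XC} \Delta_{CY'} + \left[ \hat a'_Y \hat a_X \Delta_{CY'} - (\hat a_C)^2 \Delta_{XC} \right]^2 \right\}} ; \tag{77}$$

$$\hat\sigma_{\hat b'_Y \hat a_C} = \frac{(\hat a_C)^3 \Delta_{XC}}{\hat a'_Y \left\{ 4 (\hat a'_Y)^2 (\hat a_C)^2 \Delta_{XC} \Delta_{CY'} + \left[ \hat a'_Y \hat a_X \Delta_{CY'} - (\hat a_C)^2 \Delta_{XC} \right]^2 \right\}^{1/2}}\, \frac{(w_x)_{12} + \hat a'_Y \hat a_C (w_x)_{30} - (\hat a'_Y + \hat a_C)(w_x)_{21}}{(w_x)_{20}} ; \tag{78}$$

$$\hat\sigma_{\hat b_X \hat a_C} = \frac{\hat a_C \hat a'_Y \Delta_{CY'}}{\left\{ 4 (\hat a'_Y)^2 (\hat a_C)^2 \Delta_{XC} \Delta_{CY'} + \left[ \hat a'_Y \hat a_X \Delta_{CY'} - (\hat a_C)^2 \Delta_{XC} \right]^2 \right\}^{1/2}}\, \frac{(w_x)_{03} + \hat a_X \hat a_C (w_x)_{21} - (\hat a_X + \hat a_C)(w_x)_{12}}{(w_x)_{11}} ; \tag{79}$$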

and, in addition 




(81) 



(80) 



where $\hat a_Y = (w_y)_{11}/(w_y)_{20}$; $\hat a'_Y = (w_x)_{11}/(w_x)_{20}$; $\hat a_X = (w_x)_{02}/(w_x)_{11}$; $\Delta_{CY'} = \hat a_C/\hat a'_Y - 1$; $R_C$ is defined in Appendix A and $\Theta$ is formulated in terms of $n (w_x)_{pq}/(w_x)_{00}$ instead of $S_{pq}$.

In the special case of normal and data-independent residuals, $\Theta(\hat a_C, \hat a_Y, \hat a_X) \to 0$, Eqs. (72), (73), must necessarily reduce to (70), (71), respectively, which implies an additional factor, $n/(n-2)$, in the first term on the right-hand side of Eqs. (35), (49), and (73)-(76).

In absence of a rigorous proof, Eqs. (70)-(79) must be considered as approximate results.

2.6 Reduced major-axis regression 

The reduced major-axis regression may be considered as a special case of oblique regression, where $c^2 = \hat a_X \hat a_Y$. Accordingly, Eqs. (52) and (53) also hold.

For homoscedastic data, $(w_x)_i = w_x$, $(w_y)_i = w_y$, $1 \le i \le n$, the regression line slope and intercept estimators, via Eqs. (55) and (56) are:

$$\hat a_R = \mp \left( \frac{S_{02}}{S_{20}} \right)^{1/2} = \mp \left( \hat a_X \hat a_Y \right)^{1/2} ; \tag{82}$$

$$\hat b_R = \overline{Y} - \hat a_R \overline{X} ; \tag{83}$$

where the index, R, denotes reduced major-axis regression, $\hat a_Y = S_{11}/S_{20}$; $\hat a_X = S_{02}/S_{11}$, and the double sign corresponds to the solutions of the square root, where the parasite solution must be disregarded. Accordingly, related models shall be quoted as R models. For further details refer to an earlier attempt (C11).

The regression line slope and intercept variance estimators may be directly inferred from Eqs. (57), (58), for normal residuals, in the limit, $\hat a_C \to \hat a_R = (\hat a_X \hat a_Y)^{1/2}$. The result is:

$$[(\hat\sigma_{\hat a_R})_N]^2 = \frac{(\hat a_R)^2}{n-2} \left[ \frac{(n-2) R_R}{\hat a_R S_{11}} + \Theta(\hat a_R, \hat a_Y, \hat a_X) \right] = \frac{(\hat a_R)^2}{n-2} \left[ \frac{\hat a_X - \hat a_R}{\hat a_R} + \frac{\hat a_R - \hat a_Y}{\hat a_Y} + \Theta(\hat a_R, \hat a_Y, \hat a_X) \right] ; \tag{84}$$

$$[(\hat\sigma_{\hat b_R})_N]^2 = \left[ \frac{1}{\hat a_R} \frac{S_{11}}{S_{00}} + (\overline{X})^2 \right] [(\hat\sigma_{\hat a_R})_N]^2 - \frac{\hat a_R}{n-2} \frac{S_{11}}{S_{00}} \Theta(\hat a_R, \hat a_Y, \hat a_X) ; \tag{85}$$

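and for non normal residuals the application of the $\delta$-method yields (Ial90):

$$(\hat\sigma_{\hat a_R})^2 = (\hat a_R)^2 \left[ \frac{1}{4} \frac{(\hat\sigma_{\hat a_Y})^2}{(\hat a_Y)^2} + \frac{1}{4} \frac{(\hat\sigma_{\hat a_X})^2}{(\hat a_X)^2} + \frac{1}{2} \frac{\hat\sigma_{\hat a_Y \hat a_X}}{\hat a_Y \hat a_X} \right] ; \tag{86}$$

$$(\hat\sigma_{\hat b_R})^2 = \frac{\hat a_R}{n} \left[ \frac{\hat a_X - \hat a_R}{\hat a_R} + \frac{\hat a_R - \hat a_Y}{\hat a_Y} \right] \frac{S_{11}}{S_{00}} + (\overline{X})^2 (\hat\sigma_{\hat a_R})^2 - \frac{2}{n} \overline{X} \left( \hat\sigma_{\hat b_Y \hat a_R} + \hat\sigma_{\hat b_X \hat a_R} \right) ; \tag{87}$$

$$\hat\sigma_{\hat b_Y \hat a_R} = \frac{1}{2} \left( \frac{\hat a_X}{\hat a_Y} \right)^{1/2} \frac{S_{12} + \hat a_Y \hat a_R S_{30} - (\hat a_Y + \hat a_R) S_{21}}{S_{20}} ; \tag{88}$$

$$\hat\sigma_{\hat b_X \hat a_R} = \frac{1}{2} \left( \frac{\hat a_Y}{\hat a_X} \right)^{1/2} \frac{S_{03} + \hat a_X \hat a_R S_{21} - (\hat a_X + \hat a_R) S_{12}}{S_{11}} ; \tag{89}$$

where $\hat\sigma_{\hat a_Y \hat a_X}$ is defined by Eq. (61) and Eqs. (86), (87), are equivalent to their counterparts expressed in the parent paper (Ial90). For further details refer to Appendix D.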
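The reduced major-axis estimators for homoscedastic data lend themselves to a very short numerical sketch. It assumes unweighted deviation traces, $S_{pq} = \sum_i (x_i - \overline{X})^p (y_i - \overline{Y})^q$, and resolves the double sign by giving the slope the sign of $S_{11}$; the function name `rma_fit` is hypothetical.

```python
import math


def rma_fit(x, y):
    """Reduced major-axis slope and intercept for homoscedastic data."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    s20 = sum((xi - mx) ** 2 for xi in x)
    s02 = sum((yi - my) ** 2 for yi in y)
    s11 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if s11 == 0.0:
        raise ValueError("S_11 = 0: the sign of the slope is undefined")
    # |a_R| = (S_02 / S_20)^(1/2) = (a_X a_Y)^(1/2); the sign follows S_11.
    a = math.copysign(math.sqrt(s02 / s20), s11)
    return a, my - a * mx
```

Since $\hat a_X \hat a_Y = S_{02}/S_{20}$, the squared slope returned by the function equals the product of the X and Y model slopes, in agreement with the interpretation of the R model as an oblique regression with $c^2 = \hat a_X \hat a_Y$.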

The extension of the above results to heteroscedastic data via Eqs. (82)-(89) reads:



a R = =F, 



(W;cJo2 



\| (w x ) 20 
bn = Y - a R X : 



=FV a xaY 



[(ft 



a R jN 



(a R ) 2 [71-2^(^)00 . ./ . x 

+ B(a R , a Y ,a x ) 



n-2 
(«r) 2 



?7 



n a R (WJn 



o x — a R a R — a Y A/ „ 

— ; 1 17 h B(a R , a Y ,a x ) 



or 



Y 



[(ft 



1 K) 



11 



+ (*) s 



) 2 = («r) 2 ^ 



Hft* 



17- \ 12 a R l^xjll 
n-2 {w x ) 00 

1 (ftax) 1 ftayax 



UT 



Or 



6 R ' 



n 



4 (a Y ) 2 4 (a x ) 2 2 a Y a x 
^)n 



(90) 
(91) 

5 (92) 
6(a R ,a Y ,a x ) ; (93) 
(94) 



a-x — O-R clr — Cty 



«R 



Y 



^(ftfeyaR + h^a, • ■ 



(95) 



20 



by an 



If ax 
2 \a Y 

1 



1/2 



K)i2 + a Y a K (w x ) 30 - (a Y + d R )(w x ) 2 i 



(W x )20 



a Y 



1/2 



and, in addition: 



w x )q3 + dxaR{w x ) 2 i ~ («x + Qr)(w x )i2 

(Wx)ll 



4 



-(^) 2 + -(-a x ) 2 + 2a, 



ay ax 



(Wx)ll/(Wx)20; OX 
\2 



(96) 
(97) 



(9* 



(W;r)02/(W; 



i Hi 



(t„i )-, are expressed by Eqs. (1741 . (T8TT) . 



where a Y = («^)n/(«^)2o; a Y = 
is defined in Appendix [Aj cxa Y a x > 
respectively, and 9 is formulated in terms of n(vT x ) pq / (mz)oo instead of S pq . 

In absence of a rigorous proof, Eqs. (92)-(97) must be considered as approximate results.



R 



2.7 Bisector regression 

The bisector regression implies use of both Y and X models for determining the angle formed by related regression lines. The bisecting line is assumed to be the estimated regression line of the model.

Let $\alpha_Y$, $\alpha_X$, $\alpha_B$, be the angles formed between Y, X, B, regression line, respectively, and x axis, and $\gamma$ the angle formed between Y and X regression lines, as outlined in Fig. 1. Accordingly, $\gamma/2$ is the angle formed between Y or X and B regression lines.

The following relations can easily be deduced from Fig. 1: $\alpha_X = \alpha_Y + \gamma$; $\alpha_B = \alpha_Y + \gamma/2 = (\alpha_Y + \alpha_X)/2$, and the dimensionless slope of the regression line is $\tan\alpha_B$. Using the trigonometric formulae:

$$\tan(u + v) = \frac{\tan u + \tan v}{1 - \tan u \tan v} ; \quad \tan\frac{u}{2} = \frac{\sin u}{1 + \cos u} ;$$

and the identity:

$$\frac{X(1 + S_Y) + Y(1 + S_X)}{(1 + S_X)(1 + S_Y) - XY} = \frac{XY - 1 + S_X S_Y}{X + Y} ;$$

$$X = \frac{\hat a_X}{a_U} ; \quad Y = \frac{\hat a_Y}{a_U} ; \quad S_X = \sqrt{1 + X^2} ; \quad S_Y = \sqrt{1 + Y^2} ;$$

the regression line slope estimator, after some algebra, is expressed as (Ial90):

$$\hat a_B = \frac{\hat a_Y \hat a_X - a_U^2 + \sqrt{a_U^2 + (\hat a_Y)^2}\, \sqrt{a_U^2 + (\hat a_X)^2}}{\hat a_Y + \hat a_X} ; \tag{99}$$

where $a_U$ is the unit slope, $\hat a_Y = S_{11}/S_{20}$; $\hat a_X = S_{02}/S_{11}$; and the regression line intercept estimator reads (Ial90):

$$\hat b_B = \overline{Y} - \hat a_B \overline{X} ; \tag{100}$$

where the index, B, denotes bisector regression.

Figure 1: Regression lines related to Y, X, and B models, for an assigned sample. By definition, the B line bisects the angle, $\gamma$, formed between Y and X lines. The angle, $\alpha_B$, formed between B line and x axis, is the arithmetic mean of the angles, $\alpha_Y$ and $\alpha_X$, formed between Y line and x axis and between X line and x axis, respectively.

The bisector regression may be considered as a special case of oblique regression where the variance ratio, $c^2$, is deduced from the combination of Eqs. (55) and (99) requiring $\hat a_C = \hat a_B$. After a lot of algebra involving the roots of a second-degree equation, the result is:

where the parasite solution must be disregarded. Accordingly, Eqs. (52) and (53) also hold.
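The bisector slope and its interpretation as an oblique regression can be illustrated with a short sketch. The slope follows the expression quoted from Ial90; the variance ratio is obtained here by solving the second-degree equation behind the oblique regression slope with respect to $c^2$, which yields $c^2 = \hat a\, \hat a_Y (\hat a_X - \hat a)/(\hat a - \hat a_Y)$. This is an assumption of the example and not necessarily the form in which the result is expressed in the paper; all function names are hypothetical.

```python
import math


def bisector_slope(a_y, a_x, a_u=1.0):
    """Bisector slope from the Y and X model slopes; a_u is the unit slope."""
    num = (a_y * a_x - a_u ** 2
           + math.sqrt(a_u ** 2 + a_y ** 2) * math.sqrt(a_u ** 2 + a_x ** 2))
    return num / (a_y + a_x)


def implied_variance_ratio(a, a_y, a_x):
    """c^2 for which the oblique regression slope equals a (requires a != a_y)."""
    return a * a_y * (a_x - a) / (a - a_y)


def oblique_slope(a_y, a_x, c2):
    """Oblique regression slope in terms of a_Y, a_X; root with the sign of a_Y."""
    d = a_x * a_y - c2
    return (d + math.sqrt(d * d + 4.0 * c2 * a_y * a_y)) / (2.0 * a_y)
```

As a usage example, the slopes of the Y and X models for the RB09 sample treated as homoscedastic (0.6917 and 0.7600 in Table 4) give a bisector slope close to the tabulated value, 0.7253, and the implied variance ratio, inserted into `oblique_slope`, returns the same slope.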

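For normal residuals and homoscedastic data, the regression line slope and intercept variance estimators may be directly inferred from Eqs. (57) and (58) in the limit, $\hat a_C \to \hat a_B$. The result is:

$$[(\hat\sigma_{\hat a_B})_N]^2 = \frac{(\hat a_B)^2}{n-2} \left[ \frac{(n-2) R_B}{\hat a_B S_{11}} + \Theta(\hat a_B, \hat a_Y, \hat a_X) \right] = \frac{(\hat a_B)^2}{n-2} \left[ \frac{\hat a_X - \hat a_B}{\hat a_B} + \frac{\hat a_B - \hat a_Y}{\hat a_Y} + \Theta(\hat a_B, \hat a_Y, \hat a_X) \right] ; \tag{102}$$

$$[(\hat\sigma_{\hat b_B})_N]^2 = \left[ \frac{1}{\hat a_B} \frac{S_{11}}{S_{00}} + (\overline{X})^2 \right] [(\hat\sigma_{\hat a_B})_N]^2 - \frac{\hat a_B}{n-2} \frac{S_{11}}{S_{00}} \Theta(\hat a_B, \hat a_Y, \hat a_X) ; \tag{103}$$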

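and for non normal residuals the application of the $\delta$-method yields (Ial90):

$$(\hat\sigma_{\hat a_B})^2 = \frac{(\hat a_B)^2}{(\hat a_Y + \hat a_X)^2} \left[ \frac{a_U^2 + (\hat a_X)^2}{a_U^2 + (\hat a_Y)^2} (\hat\sigma_{\hat a_Y})^2 + \frac{a_U^2 + (\hat a_Y)^2}{a_U^2 + (\hat a_X)^2} (\hat\sigma_{\hat a_X})^2 + 2 \hat\sigma_{\hat a_Y \hat a_X} \right] ; \tag{104}$$

$$(\hat\sigma_{\hat b_B})^2 = \frac{\hat a_B}{n} \left[ \frac{\hat a_X - \hat a_B}{\hat a_B} + \frac{\hat a_B - \hat a_Y}{\hat a_Y} \right] \frac{S_{11}}{S_{00}} + (\overline{X})^2 (\hat\sigma_{\hat a_B})^2 - \frac{2}{n} \overline{X} \left( \hat\sigma_{\hat b_Y \hat a_B} + \hat\sigma_{\hat b_X \hat a_B} \right) ; \tag{105}$$

$$\hat\sigma_{\hat b_Y \hat a_B} = \frac{\hat a_B \sqrt{a_U^2 + (\hat a_X)^2}}{(\hat a_Y + \hat a_X) \sqrt{a_U^2 + (\hat a_Y)^2}}\, \frac{S_{12} + \hat a_Y \hat a_B S_{30} - (\hat a_Y + \hat a_B) S_{21}}{S_{20}} ; \tag{106}$$

$$\hat\sigma_{\hat b_X \hat a_B} = \frac{\hat a_B \sqrt{a_U^2 + (\hat a_Y)^2}}{(\hat a_Y + \hat a_X) \sqrt{a_U^2 + (\hat a_X)^2}}\, \frac{S_{03} + \hat a_X \hat a_B S_{21} - (\hat a_X + \hat a_B) S_{12}}{S_{11}} ; \tag{107}$$

where $\hat\sigma_{\hat a_Y \hat a_X}$ is defined by Eq. (61) and Eqs. (104), (105), are equivalent to their counterparts expressed in the parent paper (Ial90). For further details refer to Appendix D.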

For heteroscedastic data, the combination of Eqs. (68) and (99), requiring $\hat a_C = \hat a_B$, after a lot of algebra involving the roots of a second-degree equation, yields:



U*B 



2 <^X ~~F f^B / &B ~F 



«B 



(10* 



where $\hat a_X = (w_x)_{02}/(w_x)_{11}$; $\hat a'_Y = (w_x)_{11}/(w_x)_{20}$; and the parasite solution must be disregarded. Accordingly, Eqs. (52) and (53) also hold.
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The extension of the above results to heteroscedastic data via Eqs. (J9"9"]l - 
ffTUUD and dTO-dMI) reads: 
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n - 2 i? B (w^ 
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n 
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6(a B ,a Y ,a x ); (111) 
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6xaB (a Y + a x )sjal + (a x ) 2 
and, in addition: 



;(115) 
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where ay 



a B 



a Y + a x ) 



«u + (ax) 2 r , 2 | a^ + K) 2 

,2 i i^^^W + 2 i ^ \2V^x 



< + (ay J 



< + (°xj 



avw )" + 2<j, 



;(116) 



(w 3/ )n/(wj / ) 2 o; a^ 
is defined in Appendix [AJ a, 



Y 



^31)11/(^)20; a x = (w x ) 02 / (w x ) n ; R 
1 are expressed by Eqs. (1711) . (15T|) . 



respectively, and 9 is formulated in terms of n{w x ) pq / '(uQoo instead of S pq . 

In absence of a rigorous proof, Eqs. (111)-(115) must be considered as approximate results.



2.8 Extension to structural models 

A nontrivial question is to what extent the above results, valid for extreme structural models, can be extended to generic structural models. In general, assumptions related to generic structural models are different from their counterparts related to extreme structural models (e.g., Buonaccorsi, 2006; 2010, Chap. 6, §6.4.5) but, on the other hand, they could coincide for a special subclass.

In any case, whatever different assumptions and models can be made with 
regard to generic and extreme structural models, results from the former are 
expected to tend to their counterparts from the latter when the instrumental 
scatter is negligible with respect to the intrinsic scatter. It is worth noticing 
that most work on linear regression by astronomers involves the situation 
where both intrinsic scatter and heteroscedastic data are present (e.g., AB96; 
Tremaine et al., 2002; Kelly, 2007). 

A special subclass of structural models with normal residuals can be de- 
fined where, for a selected regression estimator, the regression line slope and 
intercept variance estimators are independent of the amount of instrumental 
and intrinsic scatter, including the limit of null intrinsic scatter (functional 
models) and null instrumental scatter (extreme structural models). More 
specifically, the dependence occurs only via the total (instrumental + in- 
trinsic) scatter. In this view, the whole subclass of structural models under 
consideration could be related to functional modelling (Carroll et al., 2006, 
Chap. 2, §2.1). For further details refer to the parent paper (C11).

3 An example of astronomical application 

An astronomical application performed in an earlier attempt (C11) with regard to functional models shall be repeated here for extreme structural models. Related samples will be left unchanged but with the assumption of instrumental scatter negligible with respect to intrinsic scatter, as appropriate in the case under discussion.

More specifically, the following samples related to the [O/H]-[Fe/H] relation shall be considered: RB09 (Rich and Boesgaard, 2009), n = 49, heteroscedastic data; Fal09 (Fabbian et al., 2009), n = 44, homoscedastic data with three different [O/H] determinations, namely LTE (standard local thermodynamical equilibrium for one-dimensional hydrostatic model atmospheres), SH0 (three-dimensional hydrostatic model atmospheres in absence of LTE with no account taken of the inelastic collisions via neutral H atoms, $S_{\rm H} = 0$), SH1 (three-dimensional hydrostatic model atmospheres in absence of LTE with due account taken of the inelastic collisions via neutral H atoms, $S_{\rm H} = 1$); Sal09 (Schmidt et al., 2009), n = 63, heteroscedastic data. For further details refer to the parent paper (Caimmi, 2010).

The [O/H]-[Fe/H] empirical relations are interpolated using the regression models, G, Y, X, O, R, B, for heteroscedastic data (RB09 and Sal09 samples) and Y, X, O, R, B, for homoscedastic data (Fal09 sample, cases LTE, SH0, SH1) and heteroscedastic data where intrinsic scatters are taken equal to related typical values, $\sigma_{\rm [Fe/H]} = 0.15$, $\sigma_{\rm [O/H]} = 0.15$, for both RB09 and Sal09 samples. Model G relates to a general case where the slope and intercept
estimators are determined via Eqs. (12 3 p and ( 122|) . respectively. For further 
details refer to the parent papers (Y66; Y69; C11). Slope and intercept estimators together with related dispersion estimators are listed in Tables 2, 3 and 4, 5 for heteroscedastic and homoscedastic data, respectively.

Owing to high difficulties intrinsic to the determination of slope and intercept dispersion estimators for G models, related calculations were not performed, leaving only approximate expressions (Y66) and asymptotic formulae [Appendix B, Eq. (160) related to G models]. For the remaining models, the regression line slope and intercept estimators and related dispersion estimators are calculated using Eqs. (21)-(30) and (31)-(37), case Y, homoscedastic and heteroscedastic data, respectively; Eqs. (38)-(44) and (45)-(51), case X, homoscedastic and heteroscedastic data, respectively; Eqs. (55)-(63) and (68)-(76), $c^2 = 1$, case O, homoscedastic and heteroscedastic data, respectively; Eqs. (82)-(89) and (90)-(97), case R, homoscedastic and heteroscedastic data, respectively; Eqs. (99)-(107) and (109)-(115), case B, homoscedastic and heteroscedastic data, respectively.

The regression lines determined by use of the above mentioned methods are plotted in Figs. 2 and 3 for heteroscedastic and homoscedastic data, respectively, where sample denomination and population are indicated on each panel together with model captions. Homoscedastic data are conceived as a special case of heteroscedastic data in Fig. 2 to test the computer code, which is different for heteroscedastic and homoscedastic data. It can be seen that lower panels of Figs. 2 and 3 coincide, and the regression lines related to models G and O in lower panels of Fig. 2 also coincide, as expected. The whole set of regression lines for all methods and all samples is shown in the upper right panel of Figs. 2 and 3.

Regression line slope and intercept estimators have the same expression for both structural and functional models. Accordingly, Figs. 2 and 3 remain unchanged with respect to their counterparts shown in an earlier attempt (C11) where, on the other hand, B models were not included.

An inspection of Tables 2-5 and Figs. 2-3 discloses the following.

(1) Either of the inequalities (Ial90):

$$\hat a_Y < \hat a_O < \hat a_R < \hat a_B < \hat a_X ; \quad \hat a_B < a_U ; \quad S_{11} > 0 ; \tag{117a}$$

$$\hat a_Y < \hat a_B < \hat a_R < \hat a_O < \hat a_X ; \quad \hat a_B > a_U ; \quad S_{11} > 0 ; \tag{117b}$$

where $a_U$ is the unit slope, holds for homoscedastic data but the contrary holds for heteroscedastic data. In particular, $\hat a_B < \hat a_R < a_U$ for

Table 2: Regression line slope estimators, $\hat a$, and related dispersion estimators, $\hat\sigma_{\hat a}$, for heteroscedastic models, G, Y, X, O, R, B, applied to the [O/H]-[Fe/H] empirical relation deduced from the following samples (from top to bottom): RB09, Sal09. Dispersion column captions: ENNR - extreme structural models with non normal residuals (Ial90); ENRR - extreme structural models with normal residuals (FB92); FNRR - functional models with normal residuals (Y66; Y69; C11); YANR - approximate formula for normal residuals (Y66; Y69; C11); AFNR - asymptotic formula for normal residuals [Appendix B, Eq. (160) related to the appropriate model]. For G models, slope and intercept estimators were not evaluated in the present attempt. For Y models and normal residuals, different slope dispersion estimators yield coinciding values, as expected.

 m    a         ENNR     ENRR     FNRR     YANR     AFNR     sample
 G    0.7279    -        -        -        0.0294   0.0288   RB09
 Y    0.6714    0.0302   0.0314   0.0314   0.0314   0.0314
 X    0.7305    0.0271   0.0290   0.0290   0.0279   0.0290
 O    0.6964    0.0277   0.0276   0.0278   0.0271   0.0274
 R    0.7050    0.0254   0.0280   0.0282   0.0272   0.0277
 B    0.7005    0.0254   0.0278   0.0280   0.0271   0.0275
 G    0.6383    -        -        -        0.0435   0.0582   Sal09
 Y    0.6167    0.0810   0.0398   0.0398   0.0398   0.0398
 X    0.8652    0.0772   0.0833   0.0829   0.0664   0.0829
 O    0.6355    0.0753   0.0609   0.0637   0.0541   0.0580
 R    0.6927    0.0677   0.0664   0.0700   0.0560   0.0626
 B    0.7336    0.0662   0.0704   0.0738   0.0579   0.0666

Table 3: Regression line intercept estimators, $\hat b$, and related dispersion estimators, $\hat\sigma_{\hat b}$, for heteroscedastic models, G, Y, X, O, R, B, applied to the [O/H]-[Fe/H] empirical relation deduced from the following samples (from top to bottom): RB09, Sal09. Dispersion column captions: ENNR - extreme structural models with non normal residuals (Ial90); ENRR - extreme structural models with normal residuals (FB92); FNRR - functional models with normal residuals (Y66; Y69; C11); YANR - approximate formula for normal residuals (Y66; Y69; C11); AFNR - asymptotic formula for normal residuals [Appendix B, Eq. (160) related to the appropriate model]. For G models, slope and intercept estimators were not evaluated in the present attempt. For Y models and normal residuals, different intercept dispersion estimators yield coinciding values, as expected.

 m    b          ENNR     ENRR     FNRR     YANR     AFNR     sample
 G    +0.0043    -        -        -        0.0672   0.0660   RB09
 Y    -0.1121    0.0608   0.0675   0.0675   0.0675   0.0675
 X    +0.0316    0.0636   0.0736   0.0735   0.0712   0.0735
 O    -0.0512    0.0523   0.0702   0.0707   0.0689   0.0697
 R    -0.0305    0.0521   0.0710   0.0725   0.0693   0.0704
 B    -0.0413    0.0522   0.0706   0.0711   0.0691   0.0700
 G    +0.0619    -        -        -        0.0251   0.0336   Sal09
 Y    +0.0439    0.0105   0.0198   0.0198   0.0198   0.0198
 X    +0.3080    0.0712   0.0676   0.0673   0.0575   0.0673
 O    +0.1461    0.0396   0.0509   0.0525   0.0469   0.0491
 R    +0.1864    0.0436   0.0547   0.0549   0.0485   0.0524
 B    +0.2153    0.0455   0.0575   0.0603   0.0501   0.0552

Table 4: Regression line slope estimators, $\hat a$, and related dispersion estimators, $\hat\sigma_{\hat a}$, for homoscedastic models, Y, X, O, R, B, applied to the [O/H]-[Fe/H] empirical relation deduced from the following samples (from top to bottom): RB09, Sal09, Fal09, cases LTE, SH0, SH1. Dispersion column captions: ENNR - extreme structural models with non normal residuals (Ial90); ENRR - extreme structural models with normal residuals (FB92); FNRR - functional models with normal residuals (Y66; Y69; C11); YANR - approximate formula for normal residuals (Y66; Y69; C11); AFNR - asymptotic formula for normal residuals [Appendix B, Eq. (160) related to the appropriate model]. For Y models and normal residuals, different slope dispersion estimators yield coinciding values, as expected.

 m    a         ENNR     ENRR     FNRR     YANR     AFNR     sample
 Y    0.6917    0.0268   0.0317   0.0317   0.0317   0.0317   RB09
 X    0.7600    0.0326   0.0349   0.0348   0.0332   0.0348
 O    0.7143    0.0282   0.0327   0.0331   0.0319   0.0324
 R    0.7251    0.0278   0.0332   0.0336   0.0321   0.0328
 B    0.7253    0.0278   0.0333   0.0336   0.0321   0.0329
 Y    0.5868    0.0596   0.0461   0.0461   0.0461   0.0461   Sal09
 X    0.8077    0.0563   0.0637   0.0635   0.0541   0.0635
 O    0.6476    0.0620   0.0509   0.0526   0.0468   0.0491
 R    0.6885    0.0523   0.0541   0.0562   0.0479   0.0519
 B    0.6916    0.0513   0.0544   0.0565   0.0480   0.0521
 Y    0.8961    0.0333   0.0303   0.0303   0.0303   0.0303   Fal09
 X    0.9381    0.0294   0.0318   0.0317   0.0310   0.0317   (LTE)
 O    0.9150    0.0319   0.0310   0.0311   0.0305   0.0308
 R    0.9168    0.0310   0.0310   0.0312   0.0305   0.0308
 B    0.9169    0.0310   0.0310   0.0312   0.0305   0.0308
 Y    1.2261    0.0459   0.0432   0.0432   0.0432   0.0432   Fal09
 X    1.2884    0.0434   0.0454   0.0454   0.0443   0.0454   (SH0)
 O    1.2640    0.0449   0.0445   0.0448   0.0436   0.0443
 R    1.2569    0.0441   0.0443   0.0445   0.0435   0.0440
 B    1.2568    0.0441   0.0443   0.0445   0.0435   0.0440
 Y    1.0492    0.0358   0.0341   0.0341   0.0341   0.0341   Fal09
 X    1.0946    0.0315   0.0356   0.0356   0.0348   0.0356   (SH1)
 O    1.0732    0.0337   0.0349   0.0350   0.0343   0.0347
 R    1.0716    0.0332   0.0348   0.0350   0.0343   0.0346
 B    1.0716    0.0332   0.0348   0.0350   0.0343   0.0346

Table 5: Regression line intercept estimators, $\hat b$, and related dispersion estimators, $\hat\sigma_{\hat b}$, for homoscedastic models, Y, X, O, R, B, applied to the [O/H]-[Fe/H] empirical relation deduced from the following samples (from top to bottom): RB09, Sal09, Fal09, cases LTE, SH0, SH1. Dispersion column captions: ENNR - extreme structural models with non normal residuals (Ial90); ENRR - extreme structural models with normal residuals (FB92); FNRR - functional models with normal residuals (Y66; Y69; C11); YANR - approximate formula for normal residuals (Y66; Y69; C11); AFNR - asymptotic formula for normal residuals [Appendix B, Eq. (160) related to the appropriate model]. For Y models and normal residuals, different intercept dispersion estimators yield coinciding values, as expected.
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0.0759 
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0.0752 
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U.UOOo 


U.UOOo 


U.UOOo 


 m    b          ENNR     ENRR     FNRR     YANR     AFNR     sample
 X    +0.2011    0.0423   0.0431   0.0430   0.0397   0.0430   Sal09
 O    +0.1212    0.0268   0.0357   0.0363   0.0343   0.0351
 R    +0.1416    0.0279   0.0373   0.0381   0.0352   0.0365
 B    +0.1431    0.0282   0.0375   0.0382   0.0352   0.0367
 Y    +0.5476    0.0761   0.0663   0.0663   0.0663   0.0663   Fal09
 X    +0.6366    0.0665   0.0693   0.0693   0.0678   0.0693   (LTE)
 O    +0.5877    0.0725   0.0676   0.0680   0.0666   0.0673
 R    +0.5916    0.0706   0.0678   0.0681   0.0667   0.0674
 B    +0.5916    0.0706   0.0678   0.0681   0.0667   0.0674
 Y    +0.8717    0.1017   0.0945   0.0945   0.0945   0.0945   Fal09
 X    +1.0037    0.1003   0.0992   0.0991   0.0968   0.0991   (SH0)
 O    +0.9519    0.1019   0.0973   0.0978   0.0953   0.0967
 R    +0.9369    0.0998   0.0967   0.0973   0.0950   0.0961
 B    +0.9367    0.0998   0.0967   0.0973   0.0950   0.0961
 Y    +0.6518    0.0808   0.0745   0.0745   0.0745   0.0745   Fal09
 X    +0.7479    0.0730   0.0777   0.0777   0.0761   0.0777   (SH1)
 O    +0.7027    0.0772   0.0762   0.0765   0.0750   0.0758
 R    +0.6993    0.0760   0.0761   0.0764   0.0750   0.0757
 B    +0.6993    0.0760   0.0761   0.0764   0.0749   0.0757


RB09 sample, see Table 2. In addition, $\hat a_Y < \hat a_G < \hat a_X$ for heteroscedastic data, but a counterexample is provided in an earlier attempt (Y66).

(2) Slope and intercept estimators from O, R and B models are in agreement within $\mp\hat\sigma$. The extension of the above result to slope and intercept estimators from Y and X models holds for samples with lower dispersion (Fal09). An increasing dispersion yields marginal (RB09) or no (Sal09) agreement within $\mp\hat\sigma$, for both heteroscedastic and homoscedastic data.

(3) For normal residuals, slope and intercept dispersion estimators related to functional and structural models yield slightly different results, as expected from the fact that related asymptotic formulae coincide [Appendix B, Eq. (160) related to the appropriate model]. Asymptotic formulae used in the current attempt make a better fit with respect to earlier approximations (Y66; Y69; C11).

(4) Systematic variations due to different sample data are dominant with respect to the intrinsic scatter.

In conclusion, regression lines deduced from different sample data represent correct (from the standpoint of regression models considered in the current attempt) [O/H]-[Fe/H] relations, but no definitive choice can be made until systematic errors due to different methods and/or spectral lines in determining oxygen abundance, are alleviated.

4 Discussion 

For an assigned sample, generic structural models belonging to a special 
subclass are indistinguishable from extreme structural models, as outlined in 
an earlier attempt (C11). Accordingly, the results of the current paper also
apply to generic structural models of the kind considered. The expression of 
regression line slope and intercept estimators and related variance estimators 
in terms of weighted deviation traces, for heteroscedastic and homoscedastic 
data, makes a second step towards a unified formalism of bivariate least 
squares linear regression. 

Exact expressions of regression line slope and intercept estimators and
related variance estimators have been rewritten in a more compact form
with respect to an earlier attempt (FB92) in the limit of oblique regression,
i.e. $(\sigma_{yy})_i/(\sigma_{xx})_i = c^2$, $1 \le i \le n$. It is
noteworthy that a constant variance ratio, $c^2$, for all data points, does
not necessarily imply equal variances, $(\sigma_{xx})_i = \sigma_{xx} =
{\rm const}$, $(\sigma_{yy})_i = \sigma_{yy} = {\rm const}$, $1 \le i \le n$.
While regression line slope and intercept estimators attain a coinciding
expression in different attempts (Y66; Y69; Ial90; FB92), the results of the
current paper show that the contrary holds for related variance estimators.
The same holds for both reduced major-axis and bisector regression.

Approximate expressions provided in earlier attempts for normal residuals
(Y66; Y69) make (at least in computed cases) a lower limit to their exact
counterparts, as shown in Tables 2-5, YANR vs. ENRR, FNRR. The same
holds, to a better extent, for the asymptotic expressions determined in the
current paper, as shown in the same tables, AFNR vs. ENRR, FNRR. Related
fractional discrepancies for low-dispersion data (RB09, Fal09) do not exceed
a few percent, rising to about 10% in the presence of large-dispersion
data (Sal09).

It is well known that the regression line slope and intercept estimators are
biased towards zero for Y models (e.g., F87, Chap. 1, §1.1.1; Carroll et al.,
2006, Chap. 3, §3.2; Kelly, 2007; Buonaccorsi, 2010, Chap. 4, §4.4). Biases
can be explicitly expressed in the special case of homoscedastic models with
normal residuals. More specifically, the condition $1 - \rho_{20} \ll 1$
ensures bias effects are negligible, where $\rho_{20}$ is the reliability
ratio:

P20 = Q !^ 20 v — ; (118) 
^20 + {n- l)a xx 

which implies $0 \le \rho_{20} \le 1$. For further details refer to specific
monographs (e.g., F87, Chap. 1, §1.1.1; Carroll et al., 2006, Chap. 3,
§3.2.1; Buonaccorsi, 2010, Chap. 4, §4.4).

Similarly, it can be seen that regression line slope and intercept variance
estimators are biased towards infinity for X models. In the special case
of homoscedastic models with normal residuals, the condition
$1 - \rho_{02} \ll 1$ ensures bias effects are negligible, where $\rho_{02}$
is the reliability ratio:

P02 = g 7T ; (119) 

O02 + {n- l)(Tyy 

which implies $0 \le \rho_{02} \le 1$ (e.g., C11).

Accordingly, slopes are underestimated in Y models and overestimated in
X models by a factor, $\rho_{20}$ and $1/\rho_{02}$, respectively. For C
models (oblique regression), O models (orthogonal regression), R models
(reduced major-axis regression), B models (bisector regression), the
regression line slope estimators lie between their counterparts related to Y
and X models, according to Eqs. (118) and (119), which implies bias
corrections (e.g., Carroll et al., 2006, Chap. 3, §3.4.2). Though there is
skepticism about an indiscriminate use of oblique regression estimators, it
is still accepted that the method is viable provided both instrumental and
intrinsic scatter are known (e.g., Carroll et al., 2006, Chap. 3, §3.4.2;
Buonaccorsi, 2010, Chap. 4, §4.5).
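The bias towards zero of the Y-model slope can be illustrated numerically. The following is a minimal sketch, assuming a simulated homoscedastic sample with errors in X only and the textbook form of the reliability ratio (true-value variance over true-value variance plus error variance), which may differ in notation from Eq. (118). The function name `attenuation_demo` and all parameter values are hypothetical and are not taken from the samples considered in the current paper.

```python
import numpy as np

def attenuation_demo(n=20000, slope=1.0, var_true=1.0, var_err=0.25, seed=0):
    """Simulate errors in X only and return (Y-model slope, reliability ratio)."""
    rng = np.random.default_rng(seed)
    x_true = rng.normal(0.0, np.sqrt(var_true), n)          # true abscissae
    x_obs = x_true + rng.normal(0.0, np.sqrt(var_err), n)   # observed abscissae
    y_obs = slope * x_true                                   # no error in Y (assumption)
    dx, dy = x_obs - x_obs.mean(), y_obs - y_obs.mean()
    a_Y = float(dx @ dy) / float(dx @ dx)                    # Y-model slope, S11/S20
    rho = var_true / (var_true + var_err)                    # textbook reliability ratio
    return a_Y, rho

a_Y, rho = attenuation_demo()
print(f"Y-model slope = {a_Y:.3f}; true slope times reliability ratio = {rho:.3f}")
```

With a true slope equal to unity, the Y-model slope is expected to lie close to the reliability ratio rather than to the true slope, which is the attenuation discussed above.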

With regard to heteroscedastic data, an inspection of Tables 2-5 shows
that for lower data dispersion (RB09 sample) the values of regression line
slope and intercept estimators, deduced for weighted (Tables 2-3) and
unweighted (Tables 4-5) data, are systematically smaller in the former case
with respect to the latter, but are still in agreement within $\mp\sigma$.
For larger dispersion data (Sal09 sample) no systematic trend of the kind
considered appears, but the values of regression line slope and intercept
estimators are still in agreement within $\mp\sigma$ for O, R, and B models.
This may be a general property of the regression models considered in the
current attempt or, more realistically, intrinsic to the samples selected
for the application performed in Section 3.

The reliability ratios, Eqs. (118) and (119), have been calculated for all
sample data and the inequalities, $\rho_{20} > 0.92$, $\rho_{02} > 0.91$,
hold in any case except $\rho_{02} > 0.86$ for the Sal09 sample, which
implies weakly biased regression line slope and intercept estimators for the
samples considered using Y and X models and, a fortiori, using C, O, R, and
B models.

5 Conclusion 

From the standpoint of a unified analytic formalism of bivariate least 
squares linear regression, extreme structural models have been conceived as 
a limiting case where the instrumental scatter is negligible (ideally null) with 
respect to the intrinsic scatter. 

Within the framework of the well known additive error model (e.g.,
Carroll et al., 2006, Chap. 1, §1.2, Chap. 3, §3.2.1; Buonaccorsi, 2010,
Chap. 4, §4.3), the classical results presented in earlier papers (Ial90;
FB92) have been
rewritten in a more compact form using a new formalism in terms of weighted 
deviation traces which, for homoscedastic data, reduce to usual quantities, 
leaving aside an unessential (but dimensional) multiplicative factor. 

Regression line slope and intercept estimators, and related variance esti- 
mators, have been expressed in the special case of uncorrelated errors in X 
and in Y for the following models: (Y) errors in X negligible (ideally null) 
with respect to errors in Y; (X) errors in Y negligible (ideally null) with 
respect to errors in X; (C) oblique regression; (O) orthogonal regression; (R) 
reduced major-axis regression; (B) bisector regression. 

Related variance estimators have been expressed for both non normal
and normal residuals and compared to their counterparts determined for
functional models (C11). Asymptotic expressions have also been found to
provide a better approximation with respect to earlier attempts (Y66; Y69).

Under the assumption that regression line slope and intercept variance 
estimators for homoscedastic and heteroscedastic data are connected to a sim- 
ilar extent in functional and structural models, the above mentioned results 
have been extended from homoscedastic to heteroscedastic data. In absence 
of a rigorous proof, related expressions have been considered as approximate 
results. 

An example of astronomical application has been considered, concerning
the [O/H]-[Fe/H] empirical relations deduced from five samples related to
different populations and/or different methods of oxygen abundance
determination. For low-dispersion samples and assigned methods, different
regression models have been found to yield results which are in agreement
within the errors ($\mp\sigma$) for both heteroscedastic and homoscedastic
data, while the contrary has been shown to hold for large-dispersion
samples. In any case, samples related to different methods have been found
to produce discrepant results, due to the presence of (still undetected)
systematic errors, which implies no definitive statement can be made at
present.

Asymptotic expressions have been found to approximate regression line
slope and intercept variance estimators, for normal residuals, to a better
extent with respect to earlier attempts (Y66; Y69). Related fractional
discrepancies have been shown not to exceed a few percent for low-dispersion
data, rising to about 10% in the presence of large-dispersion data.

An extension of the formalism to generic structural models has been left 
to a forthcoming paper. 

Acknowledgements 

Thanks are due to G.J. Babu, E.D. Feigelson, M.A. Bershady, I. Lavagnini,
S.J. Schmidt for fruitful e-mail correspondence on their quoted papers (FB92;
AB96; Lavagnini and Magno, 2007; Sal09; respectively). The author is
indebted
debted to G.J. Babu and E.D. Feigelson for having kindly provided the er- 
ratum of their quoted paper (Feigelson and Babu, 1992) before publication 
(Feigelson and Babu, 2011). 

References 

[1] Akritas, M.G., Bershady, M.A., 1996. ApJ 470, 706 (AB96). 

[2] Brown, P.J., 1993. Measurement, Regression and Calibration, Oxford
Statistical Science Series 12, Oxford Science Publications.

[3] Buonaccorsi, J.P., 2010. Measurement Error: Models, Methods and
Applications, Chapman & Hall/CRC.

[4] Caimmi, R., 2010. arXiv:1008.2057

[5] Caimmi, R., 2011. New Astron. 16, 337 (C11).

[6] Carroll, R.J., Ruppert, T.D., Stefanski, L.A., Crainiceanu, C.M., 2006.
Measurement Error in Nonlinear Models, Monographs on Statistics and
Applied Probability 105, ed. Chapman & Hall/CRC.

[7] Fabbian, D., Nissen, P.E., Asplund, M., et al., 2009. A&A 500, 1143
(Fal09).

[8] Feigelson, E.D., Babu, G.J., 1992. ApJ 397, 55 (FB92). 

[9] Feigelson, E.D., Babu, G.J., 2011. ApJ 728, 72 (FB92, erratum). 

[10] Fuller, W.A., 1987. Measurement Error Models, ed. J. Wiley & Sons
(F87).

[11] Garden, J.S., Mitchell, D.G., Mills, W.N., 1980. Anal. Chem. 52, 2310. 

[12] Isobe, T., Feigelson, E.D., Akritas, M.G., Babu, G.J., 1990. ApJ 364, 
104 (Ial90). 

[13] Kelly, B.C., 2007. ApJ 665, 1489. 

[14] Lavagnini, I., Magno, F., 2007. Mass Spectrometry Rev. 26, 1. 

[15] Miller, R.G., 1966. Simultaneous Statistical Inference, New York,
McGraw-Hill.

[16] Osborne, C., 1991. International Statistical Review 59, 309.

[17] Rich, J.A., Boesgaard, A.M., 2009. ApJ 701, 519 (RB09). 

[18] Schmidt, S.J., Wallerstein, G., Woolf, V.M., Bean, J.L., 2009. PASP 
121, 1083 (Sal09). 

[19] York, D., 1966. Canadian J. Phys. 44, 1079 (Y66). 

[20] York, D., 1967. Earth Plan. Science Lett. 2, 479. 

[21] York, D., 1969. Earth Plan. Science Lett. 5, 320 (Y69). 






Appendix 



A Euclidean and statistical squared residual 
sum 

For homoscedastic data, the sum of squared (dimensional) Euclidean
distances between observed points, $P_i(X_i, Y_i)$, and adjusted points on
the estimated regression line, $\hat P_i(\hat x_i, \hat y_i)$, $\hat y_i =
\hat a \hat x_i + \hat b$, is expressed as (e.g., F87, Chap. 1, §1.3.3;
FB92; C11):

$$ (n-2)R = \sum_{i=1}^n \left[(Y_i - \bar Y) - \hat a (X_i - \bar X)\right]^2 = S_{02} + (\hat a)^2 S_{20} - 2 \hat a S_{11} ; \qquad (120) $$

where $R$ is denoted as $s_{vv}$ in the above quoted earlier attempt.
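The second equality in Eq. (120) can be checked numerically. The following is a minimal sketch, assuming a small hypothetical sample and an arbitrary trial slope; the helper names `deviation_sums` and `residual_sum` are introduced here for illustration only.

```python
import numpy as np

def deviation_sums(x, y):
    """Return (S20, S02, S11): squared and crossed deviation sums about the means."""
    dx, dy = x - x.mean(), y - y.mean()
    return float(dx @ dx), float(dy @ dy), float(dx @ dy)

def residual_sum(x, y, a):
    """Direct sum of squared residuals about a line of slope a through the centroid."""
    return float(np.sum(((y - y.mean()) - a * (x - x.mean())) ** 2))

# Hypothetical data and trial slope (not taken from the samples of the paper).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.1, 0.9, 2.2, 2.8, 4.1])
a = 0.95

S20, S02, S11 = deviation_sums(x, y)
lhs = residual_sum(x, y, a)                      # sum over data points
rhs = S02 + a**2 * S20 - 2.0 * a * S11           # deviation-sum form of Eq. (120)
print(lhs, rhs)
```

The two numbers agree to rounding error for any trial slope, since the identity follows from expanding the square.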

The sum of squared (dimensionless) statistical distances (e.g., F87, Chap. 1,
§1.3.3) between the above mentioned points, $P_i(X_i, Y_i)$ and
$\hat P_i(\hat x_i, \hat y_i)$, reads (C11):

$$ T_R = W \left[ S_{02} + (\hat a)^2 S_{20} - 2 \hat a S_{11} \right] ; \qquad (121) $$

which, for heteroscedastic data, takes the general expression (C11):

$$ T_R = W_{02} + (\hat a)^2 W_{20} - 2 \hat a W_{11} ; \qquad (122) $$

accordingly, the extension of Eq. (120) to heteroscedastic data reads:

$$ (n-2)R = n \frac{W_{02} + (\hat a)^2 W_{20} - 2 \hat a W_{11}}{W_{00}} ; \qquad (123) $$

which, in the limit of homoscedastic data, Wi = W = W, 1 < i < n, 
Woo = nW = nW, W pq = WS pq , via Eqs. (HO]), (HP, reduces to Eq. (jI2D]l . as 
expected. 



B Equivalence between earlier and current 
formulation 

Let oblique regression models be taken into consideration under the
following restrictive assumptions: (1) homoscedastic data; (2) uncorrelated
errors in Y and in X; (3) normal residuals. Accordingly, the regression
line slope variance estimator is expressed by Eq. (57) where the function,
$\theta(\hat a_C, \hat a_Y, \hat a_X)$, may be different for different
methods and/or models, as shown in Table 1. Aiming at a formal
demonstration, some preliminary relations are needed.

In terms of dimensionless ratios, using Eqs. (I2"4"]l and (|38|) . Eq. (11201) trans- 
lates into: 



(n — 2)R ax a dx — a a — ay 

aSu a ay a dy 



where the following identities: 



ax — a a — ay ax — cly ax — a a — ay 

a ay ay a dy 

dx — d a — dy dx — dy dx — d a — dy dx — d dx — d 



a ay 



ay 



a 



ay 



ay 



a 



(124) 

(125) 
(126) 



may easily be verified. 

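In the case under discussion of oblique regression models, $\hat a =
\hat a_C$, the following inequalities hold (Ial90):

$$ \hat a_X \ge \hat a_C \ge \hat a_Y ; \quad S_{11} > 0 ; \qquad (127) $$
$$ \hat a_X \le \hat a_C \le \hat a_Y ; \quad S_{11} < 0 ; \qquad (128) $$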


which makes the left-hand side of Eq. (124) always positive provided
$S_{11} \ne 0$.

Using the method of partial differentiation, the regression line slope
variance estimator in the case under discussion is (C11):



{a>c) 
n-2 



(Su) 2 



d c S 



ii 



(129) 



and the substitution of Eqs. flU]) and flMD into ffT2H . using (jr25]) and (jT2E|) 
yields after some algebra: 



2 _ l a Cj 



n-2 



«x - ac Qc - ay ax - ac ac - Qy 

A A "T « a 



(130) 



ac ay Q*C ay 

from which the following is inferred by comparison with Eq. (f57"|) : 

e(a c ,a Y ,ax) = 2 — ; ; ; (131) 

ac ay 

as listed in Table [TJ 

Using the method of moments estimators, the elements of the sample
covariance matrix are:

$$ m_{XX} = \frac{S_{20}}{n-1} ; \quad m_{YY} = \frac{S_{02}}{n-1} ; \quad m_{XY} = m_{YX} = \frac{S_{11}}{n-1} ; \qquad (132) $$



which, in terms of the variance estimators, $(\hat\sigma_{xx})_S$ (true $x$
values distribution), $(\hat\sigma_{xx})_F$ (measurement error $x$ values
distribution), $(\hat\sigma_{yy})_F$ (measurement error $y$ values
distribution) via $(\hat c_F)^2 = (\hat\sigma_{yy})_F/(\hat\sigma_{xx})_F$,
and regression line slope estimator, $\hat a_C$, are expressed as:

$$ m_{XX} = (\hat\sigma_{xx})_S + (\hat\sigma_{xx})_F ; \qquad (133a) $$
$$ m_{YY} = (\hat a_C)^2 (\hat\sigma_{xx})_S + (\hat c_F)^2 (\hat\sigma_{xx})_F ; \qquad (133b) $$
$$ m_{XY} = m_{YX} = \hat a_C (\hat\sigma_{xx})_S ; \qquad (133c) $$

for further details and specification of the model refer to the parent paper
(F87, Chap. 1, §1.3.2).

The substitution of Eqs. (132) and (133) into (120) yields:

$$ (n-2) R_C = (n-1) \left[ (\hat a_C)^2 + (\hat c_F)^2 \right] (\hat\sigma_{xx})_F ; \qquad (134) $$

where the variance ratio, $(\hat c_F)^2$, may explicitly be expressed using
Eqs. (132) and (133). The result is:

$$ (\hat c_F)^2 = \frac{\hat a_C (S_{02} - \hat a_C S_{11})}{\hat a_C S_{20} - S_{11}} ; \qquad (135) $$

which, using Eq. (120) and performing some algebra, takes the equivalent
form:

$$ (\hat a_C)^2 + (\hat c_F)^2 = \frac{\hat a_C (n-2) R_C}{\hat a_C S_{20} - S_{11}} ; \qquad (136) $$

finally, the substitution of Eq. (136) into (134) yields:

$$ (\hat\sigma_{xx})_F = \frac{\hat a_C S_{20} - S_{11}}{(n-1) \hat a_C} ; \qquad (137) $$

where the dependence on the variance ratio, $(\hat c_F)^2$, has been
eliminated.
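The chain from Eq. (132) to Eq. (137) can be followed numerically. The following is a minimal sketch, assuming hypothetical deviation sums and a trial oblique slope lying between the Y-model and X-model slopes; the function name `moments_decomposition` is introduced here for illustration, and the expressions coded are those obtained by solving Eqs. (132)-(133) directly.

```python
def moments_decomposition(S20, S02, S11, a_C, n):
    """Return (c_F^2, (sigma_xx)_F, (sigma_xx)_S) from the moment relations."""
    c2_F = a_C * (S02 - a_C * S11) / (a_C * S20 - S11)   # variance ratio, Eq. (135)
    sxx_F = (a_C * S20 - S11) / ((n - 1) * a_C)          # error variance in x, Eq. (137)
    sxx_S = S11 / ((n - 1) * a_C)                        # true-value variance, Eq. (133c)
    return c2_F, sxx_F, sxx_S

# Hypothetical values: a_Y = S11/S20 = 0.6 and a_X = S02/S11 = 4/3 bracket a_C.
S20, S02, S11, n, a_C = 10.0, 8.0, 6.0, 11, 1.0
c2_F, sxx_F, sxx_S = moments_decomposition(S20, S02, S11, a_C, n)

lhs_134 = S02 + a_C**2 * S20 - 2.0 * a_C * S11           # (n-2) R_C via Eq. (120)
rhs_134 = (n - 1) * (a_C**2 + c2_F) * sxx_F              # right-hand side of Eq. (134)
print(c2_F, sxx_F, sxx_S, lhs_134, rhs_134)
```

For the values above, the two sides of Eq. (134) coincide and the two variance components add up to $m_{XX} = S_{20}/(n-1)$, as required by Eq. (133a).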

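In the limit of large samples ($n \gg 1$, ideally $n \to +\infty$) where, in
addition, $S_{11} \ne 0$, the regression line slope variance estimator is
(F87, Chap. 1, §1.3.2):

$$ (\hat\sigma_{\hat a_C})^2 = \frac{1}{n-1} \frac{1}{[(\hat\sigma_{xx})_S]^2} \left\{ \left[ (\hat\sigma_{xx})_S + (\hat\sigma_{xx})_F \right] R_C - (\hat a_C)^2 [(\hat\sigma_{xx})_F]^2 \right\} ; \qquad (138) $$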

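and the substitution of Eqs. (124), (132), (135), (137), into (138) yields
after some algebra:

$$ (\hat\sigma_{\hat a_C})^2 = \frac{(\hat a_C)^2}{n-2} \left[ \frac{\hat a_X - \hat a_C}{\hat a_C} + \frac{\hat a_C - \hat a_Y}{\hat a_Y} + \frac{\hat a_X - \hat a_C}{\hat a_C} \frac{\hat a_C - \hat a_Y}{\hat a_Y} + \frac{1}{n-1} \left( \frac{\hat a_C - \hat a_Y}{\hat a_Y} \right)^2 \right] ; \qquad (139) $$

from which the following is inferred by comparison with Eq. (57):

$$ \theta(\hat a_C, \hat a_Y, \hat a_X) = \frac{\hat a_X - \hat a_C}{\hat a_C} \frac{\hat a_C - \hat a_Y}{\hat a_Y} + \frac{1}{n-1} \left( \frac{\hat a_C - \hat a_Y}{\hat a_Y} \right)^2 ; \qquad (140) $$

as listed in Table 1.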
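The algebra leading from Eq. (138) to its expression in terms of dimensionless slope ratios can be cross-checked numerically. The following is a minimal sketch, assuming hypothetical deviation sums; `slope_variance_138` evaluates Eq. (138) with the moment estimators of Eqs. (132), (133) and (137), while `slope_variance_ratios` codes the closed form obtained here by direct algebra, which is meant to be compared with Eq. (139) rather than taken as a transcription of it. Both function names are introduced for illustration.

```python
def slope_variance_138(S20, S02, S11, a_C, n):
    """Eq. (138) evaluated with the method-of-moments estimators."""
    sxx_S = S11 / ((n - 1) * a_C)                            # true-value variance
    sxx_F = (a_C * S20 - S11) / ((n - 1) * a_C)              # error variance in x
    R_C = (S02 + a_C**2 * S20 - 2.0 * a_C * S11) / (n - 2)   # Eq. (120)
    return ((sxx_S + sxx_F) * R_C - a_C**2 * sxx_F**2) / ((n - 1) * sxx_S**2)

def slope_variance_ratios(S20, S02, S11, a_C, n):
    """Same quantity written through dimensionless slope ratios."""
    a_Y, a_X = S11 / S20, S02 / S11
    A_XC = (a_X - a_C) / a_C
    A_CY = (a_C - a_Y) / a_Y
    theta = A_XC * A_CY + A_CY**2 / (n - 1)
    return a_C**2 / (n - 2) * (A_XC + A_CY + theta)

# Two hypothetical cases, one with positive and one with negative correlation.
v1 = slope_variance_138(10.0, 8.0, 6.0, 1.0, 11)
v2 = slope_variance_ratios(10.0, 8.0, 6.0, 1.0, 11)
w1 = slope_variance_138(5.0, 7.0, -4.0, -1.2, 30)
w2 = slope_variance_ratios(5.0, 7.0, -4.0, -1.2, 30)
print(v1, v2, w1, w2)
```

The two routines return the same value in both cases, which supports the prefactor $(n-2)^{-1}$ and the term of order $(n-1)^{-1}$ in the function $\theta$.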

On the other hand, the regression line slope variance estimator reported 
in an earlier attempt [FB92, Eq. (4) therein] reads: 



/i. n2 ( a c) 2 



n 



(n - 2)i? c 00(5*02 - a c Sii) (n - 2)R C 



acSu (cs) 2 ac(5V 2 

2" 



n - 2 (a c ) 2 / S02 - ac^n 



n - 1 (.Sn) 2 V (cs) 2 J 
where $c_S = c$ and the counterpart of Eq. (135) holds (C11):

$$ c^2 = (c_S)^2 = \frac{\hat a_C (S_{02} - \hat a_C S_{11})}{\hat a_C S_{20} - S_{11}} ; \qquad (142) $$

and the substitution of Eqs. (124), (142), into (141), after some algebra
yields Eq. (139). Then the regression line slope variance estimator,
expressed by Eq. (141), coincides with its counterpart deduced by use of the
method of moments estimators, expressed by Eq. (139).

Using the method of least squares estimation, under the assumption that
the entire covariance matrix is known, the regression line slope variance
estimator reads (F87, Chap. 1, §1.3.3):

$$ (\hat\sigma_{\hat a_C})^2 = \frac{1}{n-1} \frac{1}{(\hat m_{xx})^2} \left\{ \hat m_{xx} \hat\sigma_{vv} + (\sigma_{xx})_F \hat\sigma_{vv} - (\hat\sigma_{xv})^2 \right\} ; \qquad (143) $$
$$ \hat\sigma_{vv} = (\sigma_{yy})_F + (\hat a_C)^2 (\sigma_{xx})_F - 2 \hat a_C (\sigma_{xy})_F ; \qquad (144) $$
$$ \hat\sigma_{xv} = (\sigma_{xy})_F - \hat a_C (\sigma_{xx})_F ; \qquad (145) $$

where $\hat m_{xx}$ is the maximum likelihood estimator for
$(\sigma_{xx})_S$, $\hat m_{xx} = (\hat\sigma_{xx})_S$. In the special case
under consideration, $(\sigma_{yy})_F = (c_F)^2 (\sigma_{xx})_F$,
$(\sigma_{xy})_F = 0$, Eqs. (143), (144), (145), reduce to:

$$ (\hat\sigma_{\hat a_C})^2 = \frac{1}{n-1} \frac{1}{(\hat m_{xx})^2} \left\{ \left[ \hat m_{xx} + (\sigma_{xx})_F \right] \hat\sigma_{vv} - (\hat a_C)^2 [(\sigma_{xx})_F]^2 \right\} ; \qquad (146) $$
$$ \hat\sigma_{vv} = \left[ (\hat a_C)^2 + (c_F)^2 \right] (\sigma_{xx})_F ; \qquad (147) $$
$$ \hat\sigma_{xv} = -\hat a_C (\sigma_{xx})_F ; \qquad (148) $$

if, in addition, least squares estimators are linearly related to
corresponding moments estimators, the following relations hold:



[(o^ujisc — Cx v [(<5"xx)u] 



U = F,S 



(o'w)isc — C v (a 



vv /mme j 



(149a) 
(149b) 



where the indices, lsc, mme, mean least squares estimators and methods of 
moments estimators, respectively. 

The substitution of Eq. (IT391 into (TTO yields: 



1 



1 



, v -, , {[^(Ws + Cx f (o"^)f] C v [(a c ) 2 + (c F ) 2 ] 

x (<r M ) F - (a c ) 2 [tf Xp (a TO ) F ] 2 } ; (150) 



where the index, mme, has been omitted for simplifying the notation.
The substitution of Eq. (134) into (150) produces:



1 



1 



X 



acJ n-l[(a xx ) s } 2 
C v 

x s _ • ., \ , 



C 



Cx F 



)s + -pr L (^xx)F 



rite - {ac) — — — 

n-1 (Cx 3 y 



■[{<?xx)f} 2 



(151) 



where the estimators, $(\hat\sigma_{xx})_F$ and $(\hat\sigma_{xx})_S$, are
expressed by Eqs. (132), (133), (135), (137). Accordingly, the explicit
expression of Eq. (151) after some algebra reads:



Cr 



(Sn) 2 \c 



x s L 



a c C 
(C Xf ) 2 (a c ) 2 

(c Xs y n -i 

where the restrictive assumptions: 



Sn Cx F S02 — acSn 



x a 



C Xs 

make Eq. (11521) reduce to: 



C 



(Sn) 2 



Sn S02 — O'cSn 
a c (c F ) 2 



x s 



Rc 



(cf) 2 



n — 1 

n - 2 ' 



n - 2 

?2—l 



-ft 



(152) 



(153) 



n — 1 



5*02 — acSi] 
(cf) 2 



l2 



(154)
which formally coincides with the result of an earlier attempt where
$c_S = c$ appears instead of $c_F$ [FB92, Eq. (4) therein].

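Finally, the substitution of Eqs. (124) and (135) into (154) yields after
some algebra:

$$ (\hat\sigma_{\hat a_C})^2 = \frac{(\hat a_C)^2}{n-2} \left[ \frac{\hat a_X - \hat a_C}{\hat a_C} + \frac{\hat a_C - \hat a_Y}{\hat a_Y} + \frac{\hat a_X - \hat a_C}{\hat a_C} \frac{\hat a_C - \hat a_Y}{\hat a_Y} + \frac{1}{n-1} \left( \frac{\hat a_C - \hat a_Y}{\hat a_Y} \right)^2 \right] ; \qquad (155) $$

from which the following is inferred by comparison with Eq. (57):

$$ \theta(\hat a_C, \hat a_Y, \hat a_X) = \frac{\hat a_X - \hat a_C}{\hat a_C} \frac{\hat a_C - \hat a_Y}{\hat a_Y} + \frac{1}{n-1} \left( \frac{\hat a_C - \hat a_Y}{\hat a_Y} \right)^2 ; \qquad (156) $$

as listed in Table 1.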

The revised version of the regression line variance estimator reported in 
an earlier attempt [FB92, Eq. (4) therein, erratum 2011] reads: 

^2 



71-2 

1 



(n - 2)# c 
acSn 

(«c) 2 



n — 1 (ac) 2 + c 2 



n - 2 (a c ) 5 



n — 1 (ac) 2 + c 2 



(n - 2)i? c 



acS* 



li 



(157) 



and the substitution of Eqs. (124), (142), into (157), after a lot of algebra
yields:



ac) | a x - ac ac - ay 



n-2 
1 



71 — 1 



ac ay 
ax — ac ac — ay 



a c 



ay 



+ 



71—1 



ac — a Y 

4 a Y , 



:i58) 



from which the following is inferred by comparison with Eq. flBT|) : 

2" 



e(a c ,a Y ,ax) 



1 



TJ — 1 



ax - ac ac - ay + 1 



ac 



ay 



n — 1 



ac - ay 
» a Y . 



(159) 



as listed in Table [TJ 

The asymptotic expression ($n \to +\infty$) of Eq. (158) is obtained
neglecting the terms of lower order with respect to $1/n$. The result is:

$$ (\hat\sigma_{\hat a_C})^2 = \frac{(\hat a_C)^2}{n-2} \left[ \frac{\hat a_X - \hat a_C}{\hat a_C} + \frac{\hat a_C - \hat a_Y}{\hat a_Y} \right] ; \qquad (160) $$

which implies $\theta(\hat a_C, \hat a_Y, \hat a_X) = 0$, as listed in
Table 1. The asymptotic formula, Eq. (160), coincides with an approximation
reported in earlier attempts (Y66; Y69) for Y models and makes a better
approximation for X, C, O, R, and B models.



C Data-independent residuals 

Let $u_A$, $u_B$, be independent random variables, $f_A(u_A)\,{\rm d}u_A$,
$f_B(u_B)\,{\rm d}u_B$, related distributions, $u_A^*$, $u_B^*$, related
expectation values, and $\hat u_A$, $\hat u_B$, related estimators. The
random variable, $u = u_A u_B$, obeys the distribution, $f(u)\,{\rm d}u =
\int\!\!\int_U f_A(u_A) f_B(u_B)\,{\rm d}u_A\,{\rm d}u_B$, where $U$ is the
domain for which the product, $u_A u_B$, equals a fixed $u$. According to a
theorem of statistics, the expectation value is $u^* = (u_A u_B)^* = u_A^*
u_B^*$ and the related estimator is $\hat u = \widehat{u_A u_B} \approx
\hat u_A \hat u_B$.

The special case of the arithmetic mean reads $\bar u = \overline{u_A u_B}
\approx \bar u_A \bar u_B$ or:

$$ \frac{1}{n} \sum_{i=1}^n (u_A)_i (u_B)_i \approx \frac{1}{n} \sum_{i=1}^n (u_A)_i \, \frac{1}{n} \sum_{i=1}^n (u_B)_i ; \qquad (161) $$

with regard to $u_A$ and $u_B$ samples with population equal to $n$.
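The approximation expressed by Eq. (161) can be illustrated on simulated data. The following is a minimal sketch, assuming two independent normal samples with hypothetical means and unit variance; the function name `mean_product_vs_product_means` is introduced here for illustration.

```python
import numpy as np

def mean_product_vs_product_means(n=100000, seed=1):
    """Return (mean of the product, product of the means) for independent samples."""
    rng = np.random.default_rng(seed)
    u_A = rng.normal(2.0, 1.0, n)    # hypothetical sample with expectation value 2
    u_B = rng.normal(3.0, 1.0, n)    # independent hypothetical sample, expectation value 3
    return float(np.mean(u_A * u_B)), float(u_A.mean() * u_B.mean())

lhs, rhs = mean_product_vs_product_means()
print(lhs, rhs)
```

The difference between the two sides is the sample covariance of $u_A$ and $u_B$, which vanishes only on average for independent variables; for finite samples Eq. (161) therefore holds approximately, with a discrepancy decreasing as $n^{-1/2}$.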

With these general results in mind, let Eqs. (1281) . fT4"2l) . flBTj) . be rewritten 
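into the explicit form [Ial90, Eqs. (A4)-(A6) therein]:$^2$

$$ (\hat\sigma_{\hat a_Y})^2 = \frac{1}{(S_{20})^2} \sum_{i=1}^n \left\{ (X_i - \bar X)^2 \left[ (Y_i - \bar Y) - \hat a_Y (X_i - \bar X) \right]^2 \right\} ; \qquad (162) $$

$$ (\hat\sigma_{\hat a_X})^2 = \frac{1}{(S_{11})^2} \sum_{i=1}^n \left\{ (Y_i - \bar Y)^2 \left[ (Y_i - \bar Y) - \hat a_X (X_i - \bar X) \right]^2 \right\} ; \qquad (163) $$

$$ \hat\sigma_{\hat a_Y \hat a_X} = \frac{1}{S_{20} S_{11}} \sum_{i=1}^n \left\{ (X_i - \bar X)(Y_i - \bar Y) \left[ (Y_i - \bar Y) - \hat a_Y (X_i - \bar X) \right] \left[ (Y_i - \bar Y) - \hat a_X (X_i - \bar X) \right] \right\} ; \qquad (164) $$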



where (dimensional) residuals related to Y and X models are enclosed in
square brackets via Eqs. (25) and (39), respectively.

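If residuals are independent of the coordinates of observed points,
$P_i \equiv (X_i, Y_i)$, $1 \le i \le n$, then the particularization of
Eq. (161) to $u_A = (X_i - \bar X)^2$, $(Y_i - \bar Y)^2$,
$(X_i - \bar X)(Y_i - \bar Y)$; $u_B = [(Y_i - \bar Y) - \hat a_Y
(X_i - \bar X)]^2$, $[(Y_i - \bar Y) - \hat a_X (X_i - \bar X)]^2$,
$[(Y_i - \bar Y) - \hat a_Y (X_i - \bar X)][(Y_i - \bar Y) - \hat a_X
(X_i - \bar X)]$; respectively, makes Eqs. (162)-(164) reduce to:

$$ (\hat\sigma_{\hat a_Y})^2 = \frac{1}{n} \frac{1}{(S_{20})^2} \sum_{i=1}^n (X_i - \bar X)^2 \sum_{i=1}^n \left[ (Y_i - \bar Y) - \hat a_Y (X_i - \bar X) \right]^2 ; \qquad (165) $$

$$ (\hat\sigma_{\hat a_X})^2 = \frac{1}{n} \frac{1}{(S_{11})^2} \sum_{i=1}^n (Y_i - \bar Y)^2 \sum_{i=1}^n \left[ (Y_i - \bar Y) - \hat a_X (X_i - \bar X) \right]^2 ; \qquad (166) $$

$$ \hat\sigma_{\hat a_Y \hat a_X} = \frac{1}{n} \frac{1}{S_{20} S_{11}} \sum_{i=1}^n (X_i - \bar X)(Y_i - \bar Y) \sum_{i=1}^n \left[ (Y_i - \bar Y) - \hat a_Y (X_i - \bar X) \right] \left[ (Y_i - \bar Y) - \hat a_X (X_i - \bar X) \right] ; \qquad (167) $$

as outlined in an earlier attempt (Ial90).

$^2$ With regard to the above quoted Eqs. (A4)-(A6), it is worth noticing
that $\hat a_Y$, $\hat a_X$, are denoted as $\hat\beta_1$, $\hat\beta_2$,
respectively, and $\hat\beta_1$ has to be replaced by $(\hat\beta_1)^{-1}$
in Eq. (A6) to get the right dimensions and to be consistent with the
expression of the covariance term (Ial90, note to Table 1 therein).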

Using Eqs. (ITBl . fl24^) . (138]) . while performing some algebra, Eqs. (1165p - 
( 11671) may be cast into the form: 

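$$ (\hat\sigma_{\hat a_Y})^2 = \frac{(\hat a_Y)^2}{n} \frac{\hat a_X - \hat a_Y}{\hat a_Y} ; \qquad (168) $$

$$ (\hat\sigma_{\hat a_X})^2 = \frac{(\hat a_X)^2}{n} \frac{\hat a_X - \hat a_Y}{\hat a_Y} ; \qquad (169) $$

$$ \hat\sigma_{\hat a_Y \hat a_X} = \frac{(\hat a_Y)^2}{n} \frac{\hat a_X - \hat a_Y}{\hat a_Y} ; \qquad (170) $$

which provide correct asymptotic ($n \to +\infty$) formulae but
underestimate the true regression coefficient uncertainty in samples with
low ($n \lesssim 50$) or weakly correlated population (FB92).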
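The dimensionless-ratio form of the Y-model and X-model slope variances can be evaluated directly from the deviation sums. The following is a minimal sketch, assuming a small hypothetical sample; the function name `slope_variances` and the argument `divisor_offset` are introduced here for illustration. Setting `divisor_offset=0` uses the divisor $n$ as in Eqs. (168)-(170), while `divisor_offset=2` uses $n-2$, in which case the Y-model value coincides with the textbook ordinary least squares slope variance.

```python
import numpy as np

def slope_variances(x, y, divisor_offset=0):
    """Return (a_Y, a_X, var_aY, var_aX, cov) from dimensionless slope ratios."""
    n = len(x)
    dx, dy = x - x.mean(), y - y.mean()
    S20, S02, S11 = float(dx @ dx), float(dy @ dy), float(dx @ dy)
    a_Y, a_X = S11 / S20, S02 / S11          # Y-model and X-model slopes
    A_XY = (a_X - a_Y) / a_Y                 # dimensionless slope ratio
    d = n - divisor_offset
    return a_Y, a_X, a_Y**2 * A_XY / d, a_X**2 * A_XY / d, a_Y**2 * A_XY / d

# Hypothetical data (not taken from the samples of the paper).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.2, 0.9, 2.3, 2.7, 4.2, 4.8])

a_Y, a_X, var_aY, var_aX, cov = slope_variances(x, y, divisor_offset=2)

# Textbook OLS slope variance: residual sum of squares / [(n - 2) S20].
dx, dy = x - x.mean(), y - y.mean()
ols_var = float(np.sum((dy - a_Y * dx) ** 2)) / ((len(x) - 2) * float(dx @ dx))
print(var_aY, ols_var)
```

The agreement with the ordinary least squares value follows from $\sum [(Y_i - \bar Y) - \hat a_Y (X_i - \bar X)]^2 = S_{02} - (S_{11})^2/S_{20}$.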

As shown in Appendix B, asymptotic expressions of Eqs. (26) and (40) imply
$\theta \to 0$ and, in this limit, Eqs. (168) and (169), respectively, are
matched provided $n$ therein is replaced by $(n-2)$. Accordingly,
Eqs. (168)-(170) translate into:

$$ (\hat\sigma_{\hat a_Y})^2 = \frac{(\hat a_Y)^2}{n-2} \frac{\hat a_X - \hat a_Y}{\hat a_Y} ; \qquad (171) $$

$$ (\hat\sigma_{\hat a_X})^2 = \frac{(\hat a_X)^2}{n-2} \frac{\hat a_X - \hat a_Y}{\hat a_Y} ; \qquad (172) $$

$$ \hat\sigma_{\hat a_Y \hat a_X} = \frac{(\hat a_Y)^2}{n-2} \frac{\hat a_X - \hat a_Y}{\hat a_Y} ; \qquad (173) $$

which are expected to yield improved values for samples with low or weakly
correlated population.

With regard to oblique regression models, the substitution of
Eqs. (171)-(173) into (64) yields after some algebra:

t * \2 (»c) 2 

M = —2 

X L 2{a Y )\a c f{A XY fA xc A CY \ 

\ XY+ 4(a Y ) 2 (a c )M xc A CY +[a Y axA C Y-(ac) 2 Axc] 2 J A ' 

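where the identity:

$$ A_{XY} = A_{XC} + A_{CY} + A_{XC} A_{CY} ; \qquad (175) $$

may easily be verified. Accordingly, Eq. (174) may be cast under the form:

$$ (\hat\sigma_{\hat a_C})^2 = \frac{(\hat a_C)^2}{n-2} \left[ A_{XC} + A_{CY} + \theta(\hat a_C, \hat a_Y, \hat a_X) \right] ; \qquad (176) $$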

6(a c , a Y , ax) = A X c^cy 

x Ji + 2(a Y ) 2 (a c ) 2 (Ax Y ) 2 1 

\ 4(a Y ) 2 (a c ) 2 A X c^cY + [a Y a x AcY - (a C ) 2 ^xc] 2 J 

where an asymptotic expression of Eq. (176) implies $\theta \to 0$ and, in
this limit, Eq. (176) coincides with (64) as expected.

With regard to reduced major-axis regression models, the substitution of
Eqs. (171)-(173) into (86) yields after some algebra:

$$ (\hat\sigma_{\hat a_R})^2 = \frac{(\hat a_R)^2}{n-2} \frac{A_{XY}}{2} \left( 1 + \frac{\hat a_Y}{\hat a_X} \right) ; \qquad (178) $$

where the identity:

$$ \frac{\hat a_Y}{\hat a_X} = \frac{1}{A_{XY} + 1} ; \qquad (179) $$

may easily be verified. Accordingly, Eq. (178) may be cast into the form:

$$ (\hat\sigma_{\hat a_R})^2 = \frac{(\hat a_R)^2}{n-2} \left[ A_{XR} + A_{RY} + \theta(\hat a_R, \hat a_Y, \hat a_X) \right] ; \qquad (180) $$

$$ \theta(\hat a_R, \hat a_Y, \hat a_X) = A_{XR} A_{RY} - \frac{1}{2} \frac{(A_{XY})^2}{A_{XY} + 1} ; \qquad (181) $$

where an asymptotic expression of Eq. (180) implies $\theta \to 0$ and, in
this limit, Eq. (180) coincides with (86) as expected.

With regard to bisector regression models, the substitution of
Eqs. (171)-(173) into (104) yields after some algebra:



cr, 



(a B ) 2 A XY (a Y ) 



< + (ax) 2 a| + (ay) 2 (ax) 2 
a 2 + (a Y ) 2 a 2 + (a x ) 2 (a Y ) 2 + 



n - 2 (a Y + a-x) 2 
which, using Eq. (I179p . may be cast under the form: 



;i82) 



6(a B , a Y , a x ) = 



[A XB + A BY + 0(a B , a Y , a x )] 
" (^xy + 2) 2 



a 2 + (a x ) 2 



a 2 + (a Y ) 2 (a x ) 2 



a 2 + (a Y ) 2 a 2 + (a x ) 2 (a Y ) 5 



A XY + AxbAby 



;i83) 



;i84) 



where an asymptotic expression of Eq. (183) implies $\theta \to 0$ and, in
this limit, Eq. (183) coincides with (104) as expected.



D Special cases of oblique regression 

With regard to homoscedastic data, special cases of oblique regression
may be considered starting from the expression of regression line slope and
intercept estimators, Eqs. (55) and (56), and related variance estimators,
Eqs. (57) and (58) for normal residuals or (59) and (60) for non normal
residuals. As outlined in the parent paper (FB92), the special cases,
$c \to +\infty$, $c \to 0$, $c \to 1$, correspond to errors in X negligible
with respect to errors in Y, errors in Y negligible with respect to errors
in X, and orthogonal regression, respectively. In addition, the limiting
case, $c \to \sqrt{\hat a_X \hat a_Y}$, corresponds to reduced major-axis
regression (e.g., Ial90; C11). An exhaustive discussion related to
regression line slope and intercept estimators can be found in an earlier
attempt (C11). Finally, the limiting case, $c \to c_{\rm eqb}$, where
$c_{\rm eqb}$ is expressed by Eq. (101), corresponds to bisector regression.
The result is:



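$$ \lim_{c \to +\infty} \hat a_C = \hat a_Y ; \quad \lim_{c \to 0} \hat a_C = \hat a_X ; \quad \lim_{c \to 1} \hat a_C = \hat a_O ; \qquad (185a) $$

$$ \lim_{c \to \sqrt{\hat a_X \hat a_Y}} \hat a_C = \hat a_R ; \quad \lim_{c \to c_{\rm eqb}} \hat a_C = \hat a_B ; \qquad (185b) $$

where related models are denoted by the indices, Y, X, O, R, B, respectively.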
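The limits of the oblique regression slope for $c \to +\infty$, $c \to 0$ and $c \to \sqrt{\hat a_X \hat a_Y}$ can be checked numerically. The following is a minimal sketch, assuming the closed form obtained by solving Eq. (135) for the slope, which is the familiar Deming-regression expression; it is assumed, not verified here, to coincide with the slope estimator of Eq. (55). The function name `oblique_slope` and the deviation sums are hypothetical.

```python
import math

def oblique_slope(S20, S02, S11, c):
    """Oblique regression slope for variance ratio c^2 (Deming closed form).

    Root of S11 a^2 - (S02 - c^2 S20) a - c^2 S11 = 0 with the sign of S11.
    """
    d = S02 - c**2 * S20
    return (d + math.hypot(d, 2.0 * c * S11)) / (2.0 * S11)

S20, S02, S11 = 10.0, 8.0, 6.0                     # hypothetical deviation sums
a_Y, a_X = S11 / S20, S02 / S11                    # Y-model and X-model slopes
a_R = math.copysign(math.sqrt(S02 / S20), S11)     # reduced major-axis slope

a_large_c = oblique_slope(S20, S02, S11, 1.0e3)    # errors in X negligible
a_small_c = oblique_slope(S20, S02, S11, 1.0e-3)   # errors in Y negligible
a_rma = oblique_slope(S20, S02, S11, math.sqrt(a_X * a_Y))
a_orth = oblique_slope(S20, S02, S11, 1.0)         # orthogonal regression, c = 1
print(a_large_c, a_small_c, a_rma, a_orth)
```

For positive $S_{11}$ the slopes returned for intermediate values of $c$ lie between $\hat a_Y$ and $\hat a_X$, in agreement with the inequalities quoted in Appendix B.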

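Concerning regression line slope variance estimators for normal residuals,
the following relations can be inferred from Eq. (57):

$$ \lim_{c \to +\infty} [(\hat\sigma_{\hat a_C})_N]^2 = \frac{(\hat a_Y)^2}{n-2} \left[ \frac{\hat a_X - \hat a_Y}{\hat a_Y} + \theta(\hat a_Y, \hat a_Y, \hat a_X) \right] ; \qquad (186) $$

$$ \lim_{c \to 0} [(\hat\sigma_{\hat a_C})_N]^2 = \frac{(\hat a_X)^2}{n-2} \left[ \frac{\hat a_X - \hat a_Y}{\hat a_Y} + \theta(\hat a_X, \hat a_Y, \hat a_X) \right] ; \qquad (187) $$

$$ \lim_{c \to 1} [(\hat\sigma_{\hat a_C})_N]^2 = \frac{(\hat a_O)^2}{n-2} \left[ \frac{\hat a_X - \hat a_O}{\hat a_O} + \frac{\hat a_O - \hat a_Y}{\hat a_Y} + \theta(\hat a_O, \hat a_Y, \hat a_X) \right] ; \qquad (188) $$

$$ \lim_{c \to \sqrt{\hat a_X \hat a_Y}} [(\hat\sigma_{\hat a_C})_N]^2 = \frac{(\hat a_R)^2}{n-2} \left[ \frac{\hat a_X - \hat a_R}{\hat a_R} + \frac{\hat a_R - \hat a_Y}{\hat a_Y} + \theta(\hat a_R, \hat a_Y, \hat a_X) \right] ; \qquad (189) $$

$$ \lim_{c \to c_{\rm eqb}} [(\hat\sigma_{\hat a_C})_N]^2 = \frac{(\hat a_B)^2}{n-2} \left[ \frac{\hat a_X - \hat a_B}{\hat a_B} + \frac{\hat a_B - \hat a_Y}{\hat a_Y} + \theta(\hat a_B, \hat a_Y, \hat a_X) \right] ; \qquad (190) $$

where the function, $\theta$, is listed in Table 1 for different methods
and/or models.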

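A comparison between Eqs. (26), (40), and (186), (187), respectively,
yields:

$$ \lim_{c \to +\infty} [(\hat\sigma_{\hat a_C})_N]^2 = [(\hat\sigma_{\hat a_Y})_N]^2 ; \qquad (191) $$

$$ \lim_{c \to 0} [(\hat\sigma_{\hat a_C})_N]^2 = [(\hat\sigma_{\hat a_X})_N]^2 ; \qquad (192) $$

and, on the other hand:

$$ \lim_{c \to 1} [(\hat\sigma_{\hat a_C})_N]^2 = [(\hat\sigma_{\hat a_O})_N]^2 ; \qquad (193) $$

$$ \lim_{c \to \sqrt{\hat a_X \hat a_Y}} [(\hat\sigma_{\hat a_C})_N]^2 = [(\hat\sigma_{\hat a_R})_N]^2 ; \qquad (194) $$

$$ \lim_{c \to c_{\rm eqb}} [(\hat\sigma_{\hat a_C})_N]^2 = [(\hat\sigma_{\hat a_B})_N]^2 ; \qquad (195) $$

by definition of orthogonal regression (e.g., Carroll et al., 2006, Chap. 3,
§4.4.2), reduced major-axis regression (e.g., Ial90; C11), and bisector
regression (e.g., Ial90).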

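Concerning regression line intercept variance estimators for normal
residuals, the following relations can be inferred from Eq. (58):

$$ \lim_{c \to +\infty} [(\hat\sigma_{\hat b_C})_N]^2 = \frac{\hat a_Y}{n-2} \frac{\hat a_X - \hat a_Y}{\hat a_Y} \frac{S_{11}}{S_{00}} + (\bar X)^2 [(\hat\sigma_{\hat a_Y})_N]^2 ; \qquad (196) $$

$$ \lim_{c \to 0} [(\hat\sigma_{\hat b_C})_N]^2 = \frac{\hat a_X}{n-2} \frac{\hat a_X - \hat a_Y}{\hat a_Y} \frac{S_{11}}{S_{00}} + (\bar X)^2 [(\hat\sigma_{\hat a_X})_N]^2 ; \qquad (197) $$

$$ \lim_{c \to 1} [(\hat\sigma_{\hat b_C})_N]^2 = \frac{\hat a_O}{n-2} \left[ \frac{\hat a_X - \hat a_O}{\hat a_O} + \frac{\hat a_O - \hat a_Y}{\hat a_Y} \right] \frac{S_{11}}{S_{00}} + (\bar X)^2 [(\hat\sigma_{\hat a_O})_N]^2 ; \qquad (198) $$

$$ \lim_{c \to \sqrt{\hat a_X \hat a_Y}} [(\hat\sigma_{\hat b_C})_N]^2 = \frac{\hat a_R}{n-2} \left[ \frac{\hat a_X - \hat a_R}{\hat a_R} + \frac{\hat a_R - \hat a_Y}{\hat a_Y} \right] \frac{S_{11}}{S_{00}} + (\bar X)^2 [(\hat\sigma_{\hat a_R})_N]^2 ; \qquad (199) $$

$$ \lim_{c \to c_{\rm eqb}} [(\hat\sigma_{\hat b_C})_N]^2 = \frac{\hat a_B}{n-2} \left[ \frac{\hat a_X - \hat a_B}{\hat a_B} + \frac{\hat a_B - \hat a_Y}{\hat a_Y} \right] \frac{S_{11}}{S_{00}} + (\bar X)^2 [(\hat\sigma_{\hat a_B})_N]^2 ; \qquad (200) $$

due to Eqs. (185) and (191)-(195).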

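A comparison between Eqs. (27), (41), and (196), (197), respectively,
yields:

$$ \lim_{c \to +\infty} [(\hat\sigma_{\hat b_C})_N]^2 = [(\hat\sigma_{\hat b_Y})_N]^2 ; \qquad (201) $$

$$ \lim_{c \to 0} [(\hat\sigma_{\hat b_C})_N]^2 = [(\hat\sigma_{\hat b_X})_N]^2 ; \qquad (202) $$

and, on the other hand:

$$ \lim_{c \to 1} [(\hat\sigma_{\hat b_C})_N]^2 = [(\hat\sigma_{\hat b_O})_N]^2 ; \qquad (203) $$

$$ \lim_{c \to \sqrt{\hat a_X \hat a_Y}} [(\hat\sigma_{\hat b_C})_N]^2 = [(\hat\sigma_{\hat b_R})_N]^2 ; \qquad (204) $$

$$ \lim_{c \to c_{\rm eqb}} [(\hat\sigma_{\hat b_C})_N]^2 = [(\hat\sigma_{\hat b_B})_N]^2 ; \qquad (205) $$

by definition of orthogonal regression (e.g., Carroll et al., 2006, Chap. 3,
§4.4.2), reduced major-axis regression (e.g., Ial90; C11), and bisector
regression (e.g., Ial90).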

The intercept variance estimators for special cases of oblique regression
are expressed by a single formula characterized by different dimensionless
coefficients, $\gamma_{1k}$, $\gamma_{2k}$, where $k = 1, 2, 4$, for Y, X, O,
models, respectively (Ial90). The extended expressions for oblique
regression, where $k = 6$ for C models, read:

$$ \gamma_{16} = \frac{\hat a_C c^2}{\hat a_Y \left[ 4 (\hat a_Y)^2 c^2 + (\hat a_Y \hat a_X - c^2)^2 \right]^{1/2}} ; \qquad (206) $$

$$ \gamma_{26} = \frac{\hat a_C \hat a_Y}{\left[ 4 (\hat a_Y)^2 c^2 + (\hat a_Y \hat a_X - c^2)^2 \right]^{1/2}} ; \qquad (207) $$

which, for the above mentioned special cases, reduce to:

$$ \gamma_{11} = \lim_{c \to +\infty} \gamma_{16} = 1 ; \qquad (208) $$

$$ \gamma_{21} = \lim_{c \to +\infty} \gamma_{26} = 0 ; \qquad (209) $$

$$ \gamma_{12} = \lim_{c \to 0} \gamma_{16} = 0 ; \qquad (210) $$

$$ \gamma_{22} = \lim_{c \to 0} \gamma_{26} = 1 ; \qquad (211) $$

according to their counterparts expressed in the parent paper (Ial90) and,
in addition:

$$ \gamma_{14} = \lim_{c \to a_u} \gamma_{16} = \frac{\hat a_O a_u^2}{\hat a_Y \left[ 4 (\hat a_Y)^2 a_u^2 + (\hat a_Y \hat a_X - a_u^2)^2 \right]^{1/2}} ; \qquad (212) $$

$$ \gamma_{24} = \lim_{c \to a_u} \gamma_{26} = \frac{\hat a_O \hat a_Y}{\left[ 4 (\hat a_Y)^2 a_u^2 + (\hat a_Y \hat a_X - a_u^2)^2 \right]^{1/2}} ; \qquad (213) $$

where $a_u = 1$ is the (dimensional) unit slope, according to their
counterparts expressed in the parent paper (Ial90) provided $|\hat a_Y|$ is
replaced by $\hat a_Y$ therein.

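The validity of Eqs. (208)-(213) implies the validity of the following
relations:

$$ \lim_{c \to +\infty} (\hat\sigma_{\hat b_C})^2 = (\hat\sigma_{\hat b_Y})^2 ; \qquad (214) $$

$$ \lim_{c \to 0} (\hat\sigma_{\hat b_C})^2 = (\hat\sigma_{\hat b_X})^2 ; \qquad (215) $$

$$ \lim_{c \to 1} (\hat\sigma_{\hat b_C})^2 = (\hat\sigma_{\hat b_O})^2 ; \qquad (216) $$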



in the general case of non normal residuals. The above results cannot be
extended to R and B models, i.e. $k = 5, 3$, respectively, due to use of
the $\delta$-method for determining variance estimators (Ial90), which
implies
lim Uc ^ Uu (cTu c ) 2 ^ (<3-«u) 2 ; u = a,b; U=R,B. 

With regard to heteroscedastic data, the above results can be extended 
starting from the expression of regression line slope and intercept estima- 
tors, Eqs. (1701) and (1711) for normal residuals, which yields counterparts of 
Eqs. (jl96| - (ll99|) where n(uj x ) pq /({u x ) Q appears in place of S pg and a' Y = 
{wx)u/{wx)2o in place of dy = (vT y ) 11/(^)20- A similar procedure can be 
used for non normal residuals, starting from Eqs. (T72l) and (173]) . 



E C11 erratum



Due to the occurrence of printing errors, Eqs. (147) and (152) in an earlier
attempt (C11) were lacking a dimensionless factor and must be corrected
as follows:



{W x )o2 



1 



n-2 (w x ) 2 o { (A TO J 
2 5*02 



sgn[(w x ; 



1 



A,, 



71-25 



20 



(As) 



sgn(Sn) — 
As 



(147) 
(152) 



which are equivalent to their alternative expressions, Eqs. (149) and (154) 
therein, respectively. 

Sample FB09 listed in Table 2 therein has to be read as RB09.

Figure 2: Regression lines related to [O/H]-[Fe/H] empirical relations
deduced from two samples with heteroscedastic data, RB09 and Sal09, and
three samples with homoscedastic data (using the computer code for
heteroscedastic data), Fal09, cases LTE, SHO, and SHI, indicated on each
panel together with related population and model captions. The regression
lines related to six different methods are shown for each sample on the top
right panel. For further details refer to the text.

Figure 3: Regression lines related to [O/H]-[Fe/H] empirical relations
deduced from two samples with heteroscedastic data (with instrumental
scatters taken equal to related typical values), RB09 and Sal09, and three
samples with homoscedastic data, Fal09, cases LTE, SHO, and SHI, indicated
on each panel together with related population and model captions. The
regression lines related to five different methods are shown for each sample
on the top right panel. For further details refer to the text.